This notebook explains how to correctly interpret split points that you might see in POJOs of H2O tree-based models.
Motivation: we have seen users parse H2O POJOs and translate the Java code into another representation (SQL statements, ...). While we do not encourage using POJOs for this particular use case, we want to clarify how to interpret the numerical values correctly.
Computers and software like H2O use floating-point representation of real numbers. In this representation sequences of bits (0/1) are used to store the number with a limited precision. In H2O we use mainly 32-bit and 64-bit floating point number representation.
Let's take a look at one example of a floating-point number, 25.695312, and use its 32-bit and 64-bit representations to compare the behavior.
import numpy as np
f32 = np.float32("25.695312")
f32
25.695312
f64 = np.float64("25.695312")
f64
25.695312
If we try to compare the numbers, we will see they are not actually the same number:
f32 == f64
False
When two numbers are compared, their precision is first adjusted to be the same. This typically means the lower-precision number is converted to the higher-precision representation. In this case f32 will be converted to the float64 representation. We can do the same thing explicitly:
np.float64(f32) == f64
False
The comparison failed because the converted number is actually different
np.float64(f32)
25.6953125
Notice the 7th decimal digit after the conversion.
np.float64(f32) - f64
4.999999987376214e-07
np.float64(f32) > f64
True
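To see why the widened value differs, it can help to print the exact value each representation stores. The sketch below uses only the Python standard library; the helper name `to_float32` is our own and is not part of NumPy or H2O. It rounds a 64-bit Python float to the nearest 32-bit value with `struct` and prints the exact decimal expansions with `decimal.Decimal`.

```python
import struct
from decimal import Decimal


def to_float32(x: float) -> float:
    """Round a 64-bit float to the nearest 32-bit float, then widen it back.

    Hypothetical helper for illustration only.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


literal = 25.695312            # parsed by Python as a 64-bit float
as_f32 = to_float32(literal)   # the value a 32-bit float actually stores

# Decimal(float) shows the exact binary value, with no rounding for display.
print("64-bit:", Decimal(literal))
print("32-bit:", Decimal(as_f32))
print("32-bit value is larger:", as_f32 > literal)
```

The 32-bit value is exactly 25.6953125, which matches the output of np.float64(f32) above. The 64-bit value is merely the closest double to 25.695312, so the two representations of the same literal are different numbers.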
Understanding how computers compare numbers of different precision is critical for correctly interpreting split points in tree-based POJOs. Let's now train a simple GBM model.
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
# Connect to a pre-existing cluster
h2o.init()
Checking whether there is an H2O instance running at http://localhost:54321 . connected.
H2O_cluster_uptime: | 09 secs |
H2O_cluster_timezone: | America/New_York |
H2O_data_parsing_timezone: | UTC |
H2O_cluster_version: | 3.35.0.99999 |
H2O_cluster_version_age: | 2 hours and 53 minutes |
H2O_cluster_name: | mkurka |
H2O_cluster_total_nodes: | 1 |
H2O_cluster_free_memory: | 7.094 Gb |
H2O_cluster_total_cores: | 16 |
H2O_cluster_allowed_cores: | 16 |
H2O_cluster_status: | locked, healthy |
H2O_connection_url: | http://localhost:54321 |
H2O_connection_proxy: | {"http": null, "https": null} |
H2O_internal_security: | False |
H2O_API_Extensions: | Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4 |
Python_version: | 3.8.2 final |
from h2o.utils.shared_utils import _locate # private function. used to find files within h2o git project directory.
df = h2o.upload_file(path=_locate("smalldata/logreg/prostate.csv"))
Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
# Remove ID from training frame
train = df.drop("ID")
# For VOL & GLEASON, a zero really means "missing"
vol = train['VOL']
vol[vol == 0] = None
gle = train['GLEASON']
gle[gle == 0] = None
# Convert CAPSULE to a logical factor
train['CAPSULE'] = train['CAPSULE'].asfactor()
# Run GBM
my_gbm = H2OGradientBoostingEstimator(ntrees=1, seed=1234)
my_gbm.train(y="CAPSULE", training_frame=train)
gbm Model Build progress: |██████████████████████████████████████████████████████| (done) 100%
Model Details
=============
H2OGradientBoostingEstimator : Gradient Boosting Machine
Model Key: GBM_model_python_1636137917875_1
Model Summary:
number_of_trees | number_of_internal_trees | model_size_in_bytes | min_depth | max_depth | mean_depth | min_leaves | max_leaves | mean_leaves | ||
---|---|---|---|---|---|---|---|---|---|---|
0 | 1.0 | 1.0 | 360.0 | 5.0 | 5.0 | 5.0 | 24.0 | 24.0 | 24.0 |
ModelMetricsBinomial: gbm
** Reported on train data. **
MSE: 0.22019689456071448
RMSE: 0.4692514193486414
LogLoss: 0.6319753099030868
Mean Per-Class Error: 0.20582476749877632
AUC: 0.8816907085888687
AUCPR: 0.8515845076604194
Gini: 0.7633814171777373
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.4008312811161997:
0 | 1 | Error | Rate | ||
---|---|---|---|---|---|
0 | 0 | 176.0 | 51.0 | 0.2247 | (51.0/227.0) |
1 | 1 | 29.0 | 124.0 | 0.1895 | (29.0/153.0) |
2 | Total | 205.0 | 175.0 | 0.2105 | (80.0/380.0) |
Maximum Metrics: Maximum metrics at their respective thresholds
metric | threshold | value | idx | |
---|---|---|---|---|
0 | max f1 | 0.400831 | 0.756098 | 10.0 |
1 | max f2 | 0.379840 | 0.831486 | 16.0 |
2 | max f0point5 | 0.429293 | 0.783866 | 6.0 |
3 | max accuracy | 0.429293 | 0.807895 | 6.0 |
4 | max precision | 0.463528 | 1.000000 | 0.0 |
5 | max recall | 0.372774 | 1.000000 | 18.0 |
6 | max specificity | 0.463528 | 1.000000 | 0.0 |
7 | max absolute_mcc | 0.412406 | 0.595958 | 7.0 |
8 | max min_per_class_accuracy | 0.404036 | 0.777778 | 9.0 |
9 | max mean_per_class_accuracy | 0.404036 | 0.794175 | 9.0 |
10 | max tns | 0.463528 | 227.000000 | 0.0 |
11 | max fns | 0.463528 | 121.000000 | 0.0 |
12 | max fps | 0.363105 | 227.000000 | 19.0 |
13 | max tps | 0.372774 | 153.000000 | 18.0 |
14 | max tnr | 0.463528 | 1.000000 | 0.0 |
15 | max fnr | 0.463528 | 0.790850 | 0.0 |
16 | max fpr | 0.363105 | 1.000000 | 19.0 |
17 | max tpr | 0.372774 | 1.000000 | 18.0 |
Gains/Lift Table: Avg response rate: 40.26 %, avg score: 40.30 %
group | cumulative_data_fraction | lower_threshold | lift | cumulative_lift | response_rate | score | cumulative_response_rate | cumulative_score | capture_rate | cumulative_capture_rate | gain | cumulative_gain | kolmogorov_smirnov | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 0.084211 | 0.463528 | 2.483660 | 2.483660 | 1.000000 | 0.463528 | 1.000000 | 0.463528 | 0.209150 | 0.209150 | 148.366013 | 148.366013 | 0.209150 |
1 | 2 | 0.128947 | 0.457452 | 2.337562 | 2.432973 | 0.941176 | 0.457452 | 0.979592 | 0.461420 | 0.104575 | 0.313725 | 133.756248 | 143.297319 | 0.309320 |
2 | 3 | 0.157895 | 0.444791 | 2.032086 | 2.359477 | 0.818182 | 0.444791 | 0.950000 | 0.458372 | 0.058824 | 0.372549 | 103.208556 | 135.947712 | 0.359333 |
3 | 4 | 0.218421 | 0.432692 | 1.835749 | 2.214348 | 0.739130 | 0.436693 | 0.891566 | 0.452364 | 0.111111 | 0.483660 | 83.574879 | 121.434759 | 0.444013 |
4 | 5 | 0.300000 | 0.429622 | 1.682479 | 2.069717 | 0.677419 | 0.430389 | 0.833333 | 0.446389 | 0.137255 | 0.620915 | 68.247944 | 106.971678 | 0.537215 |
5 | 6 | 0.426316 | 0.404036 | 1.241830 | 1.824417 | 0.500000 | 0.412442 | 0.734568 | 0.436330 | 0.156863 | 0.777778 | 24.183007 | 82.441701 | 0.588350 |
6 | 7 | 0.521053 | 0.392412 | 0.827887 | 1.643230 | 0.333333 | 0.395728 | 0.661616 | 0.428948 | 0.078431 | 0.856209 | -17.211329 | 64.322968 | 0.561055 |
7 | 8 | 0.660526 | 0.383949 | 0.562338 | 1.414994 | 0.226415 | 0.385145 | 0.569721 | 0.419699 | 0.078431 | 0.934641 | -43.766186 | 41.499362 | 0.458870 |
8 | 9 | 0.763158 | 0.379840 | 0.445785 | 1.284652 | 0.179487 | 0.380533 | 0.517241 | 0.414432 | 0.045752 | 0.980392 | -55.421485 | 28.465179 | 0.363652 |
9 | 10 | 0.813158 | 0.373285 | 0.261438 | 1.221736 | 0.105263 | 0.373285 | 0.491909 | 0.411902 | 0.013072 | 0.993464 | -73.856209 | 22.173573 | 0.301834 |
10 | 11 | 1.000000 | 0.363105 | 0.034981 | 1.000000 | 0.014085 | 0.364467 | 0.402632 | 0.403039 | 0.006536 | 1.000000 | -96.501887 | 0.000000 | 0.000000 |
Scoring History:
timestamp | duration | number_of_trees | training_rmse | training_logloss | training_auc | training_pr_auc | training_lift | training_classification_error | ||
---|---|---|---|---|---|---|---|---|---|---|
0 | 2021-11-05 14:45:28 | 0.022 sec | 0.0 | 0.490428 | 0.674064 | 0.500000 | 0.402632 | 1.00000 | 0.597368 | |
1 | 2021-11-05 14:45:28 | 0.182 sec | 1.0 | 0.469251 | 0.631975 | 0.881691 | 0.851585 | 2.48366 | 0.210526 |
Variable Importances:
variable | relative_importance | scaled_importance | percentage | |
---|---|---|---|---|
0 | GLEASON | 20.125320 | 1.000000 | 0.496931 |
1 | PSA | 8.138151 | 0.404374 | 0.200946 |
2 | VOL | 6.416112 | 0.318808 | 0.158426 |
3 | DPROS | 5.819649 | 0.289170 | 0.143698 |
4 | AGE | 0.000000 | 0.000000 | 0.000000 |
5 | RACE | 0.000000 | 0.000000 | 0.000000 |
6 | DCAPS | 0.000000 | 0.000000 | 0.000000 |
# Get the POJO
my_gbm.download_pojo()
/* Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0.html AUTOGENERATED BY H2O at 2021-11-05T14:45:28.555-04:00 3.35.0.99999 Standalone prediction code with sample test data for GBMModel named GBM_model_python_1636137917875_1 How to download, compile and execute: mkdir tmpdir cd tmpdir curl http://192.168.86.229:54321/3/h2o-genmodel.jar > h2o-genmodel.jar curl http://192.168.86.229:54321/3/Models.java/GBM_model_python_1636137917875_1 > GBM_model_python_1636137917875_1.java javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m GBM_model_python_1636137917875_1.java (Note: Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.) */ import java.util.Map; import hex.genmodel.GenModel; import hex.genmodel.annotations.ModelPojo; @ModelPojo(name="GBM_model_python_1636137917875_1", algorithm="gbm") public class GBM_model_python_1636137917875_1 extends GenModel { public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; } public boolean isSupervised() { return true; } public int nfeatures() { return 7; } public int nclasses() { return 2; } // Names of columns used by model. public static final String[] NAMES = NamesHolder_GBM_model_python_1636137917875_1.VALUES; // Number of output classes included in training data response column. public static final int NCLASSES = 2; // Column domains. The last array contains domain of response column. 
public static final String[][] DOMAINS = new String[][] { /* AGE */ null, /* RACE */ null, /* DPROS */ null, /* DCAPS */ null, /* PSA */ null, /* VOL */ null, /* GLEASON */ null, /* CAPSULE */ GBM_model_python_1636137917875_1_ColInfo_7.VALUES }; // Prior class distribution public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; // Class distribution used for model building public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; public GBM_model_python_1636137917875_1() { super(NAMES,DOMAINS,"CAPSULE"); } public String getUUID() { return Long.toString(4988040225257658559L); } // Pass in data in a double[], pre-aligned to the Model's requirements. // Jam predictions into the preds[] array; preds[0] is reserved for the // main prediction (class for classifiers or value for regression), // and remaining columns hold a probability distribution for classifiers. public final double[] score0( double[] data, double[] preds ) { java.util.Arrays.fill(preds,0); GBM_model_python_1636137917875_1_Forest_0.score0(data,preds); preds[2] = preds[1] + -0.3945120960889672; preds[2] = 1./(1. 
+ Math.min(1e19, Math.exp(-(preds[2])))); preds[1] = 1.0-preds[2]; preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997); return preds; } } // The class representing training column names class NamesHolder_GBM_model_python_1636137917875_1 implements java.io.Serializable { public static final String[] VALUES = new String[7]; static { NamesHolder_GBM_model_python_1636137917875_1_0.fill(VALUES); } static final class NamesHolder_GBM_model_python_1636137917875_1_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "AGE"; sa[1] = "RACE"; sa[2] = "DPROS"; sa[3] = "DCAPS"; sa[4] = "PSA"; sa[5] = "VOL"; sa[6] = "GLEASON"; } } } // The class representing column CAPSULE class GBM_model_python_1636137917875_1_ColInfo_7 implements java.io.Serializable { public static final String[] VALUES = new String[2]; static { GBM_model_python_1636137917875_1_ColInfo_7_0.fill(VALUES); } static final class GBM_model_python_1636137917875_1_ColInfo_7_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "0"; sa[1] = "1"; } } } class GBM_model_python_1636137917875_1_Forest_0 { public static void score0(double[] fdata, double[] preds) { preds[1] += GBM_model_python_1636137917875_1_Tree_0_class_0.score0(fdata); } } class GBM_model_python_1636137917875_1_Tree_0_class_0 { static final double score0(double[] data) { double pred = (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5f ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? (data[6 /* GLEASON */] < 5.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.44375f ? -0.16740088f : (data[5 /* VOL */] < 35.319237f ? -0.0842475f : -0.16740088f)) : (data[2 /* DPROS */] < 1.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375f ? -0.07830798f : 0.005835324f))) : (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.645508f ? 
(data[5 /* VOL */] < 4.4484377f ? -0.007490538f : (data[4 /* PSA */] < 3.6390624f ? -0.16740088f : -0.1258242f)) : (data[6 /* GLEASON */] < 5.5f ? -0.05400991f : (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.1625f ? 0.17277204f : 0.040482566f)))) : (data[4 /* PSA */] < 14.730078f ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.1585937f ? (data[4 /* PSA */] < 7.995f ? 0.10977705f : -0.039472606f) : -0.12363595f) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264843f ? (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187f ? 0.1524198f : -0.042670812f) : 0.22390914f)) : (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812f ? (data[4 /* PSA */] < 18.55625f ? 0.24836601f : 0.11424766f) : 0.01078493f) : (data[4 /* PSA */] < 22.60625f ? 0.12363594f : 0.24836601f)))); return pred; } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B }
Take a close look at the POJO code. You should see statements like this one:
Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f
This code represents one split decision in a GBM tree. data represents a single input row. The split decision looks at column VOL, which is element 5 of the data array, to decide whether the observation should go to the left or to the right sub-tree. In this particular statement a missing value (NaN) also goes left.
It is important to notice that data is defined as a double array:
double[] data
This means data is represented by 64-bit floating point numbers.
The split point itself, however, is output in 32-bit precision. In Java code this is expressed by the f suffix on the number literal, e.g. 25.695312f.
This means we have the same scenario as outlined in the beginning of this notebook - we are comparing numbers with two different precisions and we need to pay attention to how the numbers are interpreted.
data = np.array([0, 0, 0, 0, 0, np.float64(25.695312)])
data[5]
25.695312
The Java comparison rewritten in Python would look like this:
data[5] < np.float32(25.695312)
True
This means that the observation represented by the array data should go to the left sub-tree of the current node. If we ignored the fact that the split point uses 32-bit precision and treated it as a 64-bit number, we would incorrectly send the observation to the right sub-tree:
data[5] < np.float64(25.695312)
False
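The two comparisons above can be wrapped into a small function that mimics how Java evaluates this one split statement. This is a sketch under stated assumptions, not H2O code: `to_float32` and `goes_left` are hypothetical names, the NaN handling mirrors only statements that contain the `Double.isNaN(...) ||` check, and rounding through a Python float first could, in rare corner cases, differ from Java parsing the `f` literal directly.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round to the nearest 32-bit float and widen back (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def goes_left(value: float, split_literal: str) -> bool:
    """Mimic the Java test `Double.isNaN(v) || v < <literal>` from the POJO.

    A literal ending in 'f' is rounded to 32 bits and then widened to
    64 bits before the comparison, as Java does for double < float.
    """
    if split_literal.endswith("f"):
        split = to_float32(float(split_literal[:-1]))
    else:
        split = float(split_literal)
    return math.isnan(value) or value < split


vol = 25.695312
print(goes_left(vol, "25.695312f"))           # True: correct 32-bit reading
print(goes_left(vol, "25.695312"))            # False: suffix ignored, wrong branch
print(goes_left(float("nan"), "25.695312f"))  # True: NaN goes left in this statement
```

Anyone translating a POJO into another representation would need the first behavior; dropping the suffix silently changes the branch for values that fall between the two readings of the split point.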
H2O allows users to modify the POJO output by setting the property sys.ai.h2o.java.output.doubles. Setting this property to true will cause the POJO generator to output split points in 64-bit precision (doubles) instead of the default 32-bit precision.
We can set this property even on a running H2O instance by invoking a rapids expression.
h2o.rapids("(setproperty \"{}\" \"{}\")".format("sys.ai.h2o.java.output.doubles", "true"))["string"]
'Old values of sys.ai.h2o.java.output.doubles (per node): null'
my_gbm.download_pojo()
/* Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0.html AUTOGENERATED BY H2O at 2021-11-05T14:45:28.619-04:00 3.35.0.99999 Standalone prediction code with sample test data for GBMModel named GBM_model_python_1636137917875_1 How to download, compile and execute: mkdir tmpdir cd tmpdir curl http://192.168.86.229:54321/3/h2o-genmodel.jar > h2o-genmodel.jar curl http://192.168.86.229:54321/3/Models.java/GBM_model_python_1636137917875_1 > GBM_model_python_1636137917875_1.java javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m GBM_model_python_1636137917875_1.java (Note: Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.) */ import java.util.Map; import hex.genmodel.GenModel; import hex.genmodel.annotations.ModelPojo; @ModelPojo(name="GBM_model_python_1636137917875_1", algorithm="gbm") public class GBM_model_python_1636137917875_1 extends GenModel { public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; } public boolean isSupervised() { return true; } public int nfeatures() { return 7; } public int nclasses() { return 2; } // Names of columns used by model. public static final String[] NAMES = NamesHolder_GBM_model_python_1636137917875_1.VALUES; // Number of output classes included in training data response column. public static final int NCLASSES = 2; // Column domains. The last array contains domain of response column. 
public static final String[][] DOMAINS = new String[][] { /* AGE */ null, /* RACE */ null, /* DPROS */ null, /* DCAPS */ null, /* PSA */ null, /* VOL */ null, /* GLEASON */ null, /* CAPSULE */ GBM_model_python_1636137917875_1_ColInfo_7.VALUES }; // Prior class distribution public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; // Class distribution used for model building public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; public GBM_model_python_1636137917875_1() { super(NAMES,DOMAINS,"CAPSULE"); } public String getUUID() { return Long.toString(4988040225257658559L); } // Pass in data in a double[], pre-aligned to the Model's requirements. // Jam predictions into the preds[] array; preds[0] is reserved for the // main prediction (class for classifiers or value for regression), // and remaining columns hold a probability distribution for classifiers. public final double[] score0( double[] data, double[] preds ) { java.util.Arrays.fill(preds,0); GBM_model_python_1636137917875_1_Forest_0.score0(data,preds); preds[2] = preds[1] + -0.3945120960889672; preds[2] = 1./(1. 
+ Math.min(1e19, Math.exp(-(preds[2])))); preds[1] = 1.0-preds[2]; preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997); return preds; } } // The class representing training column names class NamesHolder_GBM_model_python_1636137917875_1 implements java.io.Serializable { public static final String[] VALUES = new String[7]; static { NamesHolder_GBM_model_python_1636137917875_1_0.fill(VALUES); } static final class NamesHolder_GBM_model_python_1636137917875_1_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "AGE"; sa[1] = "RACE"; sa[2] = "DPROS"; sa[3] = "DCAPS"; sa[4] = "PSA"; sa[5] = "VOL"; sa[6] = "GLEASON"; } } } // The class representing column CAPSULE class GBM_model_python_1636137917875_1_ColInfo_7 implements java.io.Serializable { public static final String[] VALUES = new String[2]; static { GBM_model_python_1636137917875_1_ColInfo_7_0.fill(VALUES); } static final class GBM_model_python_1636137917875_1_ColInfo_7_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "0"; sa[1] = "1"; } } } class GBM_model_python_1636137917875_1_Forest_0 { public static void score0(double[] fdata, double[] preds) { preds[1] += GBM_model_python_1636137917875_1_Tree_0_class_0.score0(fdata); } } class GBM_model_python_1636137917875_1_Tree_0_class_0 { static final double score0(double[] data) { double pred = (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5 ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? (data[6 /* GLEASON */] < 5.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.443750381469727 ? -0.16740088164806366 : (data[5 /* VOL */] < 35.319236755371094 ? -0.08424749970436096 : -0.16740088164806366)) : (data[2 /* DPROS */] < 1.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? -0.0957169309258461 : -0.16740088164806366) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375 ? 
-0.07830797880887985 : 0.005835324060171843))) : (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.6455078125 ? (data[5 /* VOL */] < 4.448437690734863 ? -0.007490538060665131 : (data[4 /* PSA */] < 3.6390624046325684 ? -0.16740088164806366 : -0.1258241981267929)) : (data[6 /* GLEASON */] < 5.5 ? -0.05400991067290306 : (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.162500381469727 ? 0.17277203500270844 : 0.04048256576061249)))) : (data[4 /* PSA */] < 14.730077743530273 ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.158593654632568 ? (data[4 /* PSA */] < 7.994999885559082 ? 0.1097770482301712 : -0.03947260603308678) : -0.12363594770431519) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264842987060547 ? (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187118530273 ? 0.1524198055267334 : -0.04267081245779991) : 0.2239091396331787)) : (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812118530273 ? (data[4 /* PSA */] < 18.556249618530273 ? 0.24836601316928864 : 0.11424765735864639) : 0.010784929618239403) : (data[4 /* PSA */] < 22.606250762939453 ? 0.12363594025373459 : 0.24836601316928864)))); return pred; } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B }
In the modified POJO output you can now see the original split is coded as
Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? -0.0957169309258461 : -0.16740088164806366
Notice the last decimal place and observe that there is now no f suffix at the end of the number. Compare it to the original version:
Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f
The 64-bit precision output might be more natural for users trying to understand what the POJO is doing when it decides how a given observation should traverse the tree.
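If regenerating the POJO is not an option, the same widening can be approximated on an existing 32-bit POJO expression. The following is only a rough sketch: `FLOAT_LITERAL` and `widen_float_literals` are hypothetical names, the regular expression covers plain decimal literals with an `f` suffix rather than the full Java grammar, and the printed digits come from Python's `repr`, which may not match Java's formatting digit for digit.

```python
import re
import struct

# A literal such as 25.695312f or 1.0E-5f that is not part of an identifier
# or the tail of a longer number.
FLOAT_LITERAL = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)[fF]\b")


def _widen(match):
    """Replace one float literal by the double that holds the same value."""
    f32 = struct.unpack("<f", struct.pack("<f", float(match.group(1))))[0]
    return repr(f32)


def widen_float_literals(java_source: str) -> str:
    """Rewrite every matched 32-bit literal as an equivalent 64-bit literal."""
    return FLOAT_LITERAL.sub(_widen, java_source)


split = "data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f"
widened = widen_float_literals(split)
print(widened)
```

For the statement discussed above, the split point becomes 25.6953125, in agreement with the POJO generated with sys.ai.h2o.java.output.doubles set to true. Setting the property remains the more reliable route, since it avoids re-parsing the generated code.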
Suppose we already have a MOJO model that was created by an older H2O version and we want to see what the POJO would look like with numbers represented in 64 bits.
For this use case H2O provides a conversion tool, MojoConvertTool, as a part of h2o.jar.
mojo_path = my_gbm.download_mojo()
mojo_path
'/Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip'
# Find h2o.jar (this is using internal functions)
from h2o.backend import H2OLocalServer
h2o_jar = H2OLocalServer()._find_jar()
# Invoke MojoConvertTool without arguments to print out usage instructions
import subprocess
subprocess.call(["java", "-cp", h2o_jar, "water.tools.MojoConvertTool"], stderr=subprocess.STDOUT, shell=False)
java -cp h2o.jar water.tools.MojoConvertTool source_mojo.zip target_pojo.java
1
# Add path to MOJO file and write output to "pojo.java"
subprocess.call(["java", "-cp", h2o_jar, "water.tools.MojoConvertTool", mojo_path, "pojo.java"], stderr=subprocess.STDOUT, shell=False)
Starting local H2O instance to facilitate MOJO to POJO conversion. 14:45:29.416 [main] INFO hex.tree.xgboost.util.NativeLibrary - Loaded library from lib/osx_64/libxgboost4j_minimal.dylib (/var/folders/v1/fkjmcbkd11v2mrm4dm6345ym0000gn/T/libxgboost4j_minimal6279070988842798503.dylib) 11-05 14:45:29.543 127.0.0.1:54321 79164 main INFO water.default: ----- H2O started ----- 11-05 14:45:29.544 127.0.0.1:54321 79164 main INFO water.default: Build git branch: master 11-05 14:45:29.544 127.0.0.1:54321 79164 main INFO water.default: Build git hash: b9ba1af5f07c6dbc6369e41113ea43947109e054 11-05 14:45:29.544 127.0.0.1:54321 79164 main INFO water.default: Build git describe: jenkins-master-5625-7-gb9ba1af5f0 11-05 14:45:29.544 127.0.0.1:54321 79164 main INFO water.default: Build project version: 3.35.0.99999 11-05 14:45:29.544 127.0.0.1:54321 79164 main INFO water.default: Build age: 2 hours and 53 minutes 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Built by: 'mkurka' 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Built on: '2021-11-05 11:51:34' 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone] 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Processed H2O arguments: [-disable_web, -ip, localhost, -disable_net] 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Java availableProcessors: 16 11-05 14:45:29.545 127.0.0.1:54321 79164 main INFO water.default: Java heap totalMemory: 491.0 MB 11-05 14:45:29.546 127.0.0.1:54321 79164 main INFO water.default: Java heap maxMemory: 7.11 GB 11-05 14:45:29.546 127.0.0.1:54321 79164 main INFO water.default: Java version: Java 1.8.0_311 (from Oracle Corporation) 11-05 14:45:29.546 127.0.0.1:54321 79164 main INFO water.default: JVM launch parameters: [] 11-05 14:45:29.546 127.0.0.1:54321 79164 main INFO water.default: JVM process id: 79164@michals-mbp.lan 11-05 14:45:29.546 127.0.0.1:54321 79164 main 
INFO water.default: OS version: Mac OS X 10.16 (x86_64) 11-05 14:45:29.547 127.0.0.1:54321 79164 main INFO water.default: Machine physical memory: 32.00 GB 11-05 14:45:29.548 127.0.0.1:54321 79164 main INFO water.default: Machine locale: en_US 11-05 14:45:29.549 127.0.0.1:54321 79164 main INFO water.default: X-h2o-cluster-id: 1636137928927 11-05 14:45:29.549 127.0.0.1:54321 79164 main INFO water.default: User name: 'mkurka' 11-05 14:45:29.549 127.0.0.1:54321 79164 main INFO water.default: IPv6 stack selected: false 11-05 14:45:29.549 127.0.0.1:54321 79164 main INFO water.default: H2O node running in unencrypted mode. 11-05 14:45:30.081 127.0.0.1:54321 79164 main INFO water.default: Kerberos not configured 11-05 14:45:30.081 127.0.0.1:54321 79164 main INFO water.default: Log dir: '/tmp/h2o-mkurka/h2ologs' 11-05 14:45:30.081 127.0.0.1:54321 79164 main INFO water.default: Cur dir: '/Users/mkurka/git/h2o/h2o-3' 11-05 14:45:30.087 127.0.0.1:54321 79164 main INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized 11-05 14:45:30.088 127.0.0.1:54321 79164 main INFO water.default: HDFS subsystem successfully initialized 11-05 14:45:30.090 127.0.0.1:54321 79164 main INFO water.default: S3 subsystem successfully initialized 11-05 14:45:30.102 127.0.0.1:54321 79164 main INFO water.default: GCS subsystem successfully initialized 11-05 14:45:30.103 127.0.0.1:54321 79164 main INFO water.default: Flow dir: '/Users/mkurka/h2oflows' 11-05 14:45:30.108 127.0.0.1:54321 79164 main INFO water.default: Cloud of size 1 formed [localhost/127.0.0.1:54321] 11-05 14:45:30.116 127.0.0.1:54321 79164 main INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV] 11-05 14:45:30.117 127.0.0.1:54321 79164 main INFO water.default: XGBoost extension initialized 11-05 14:45:30.117 127.0.0.1:54321 79164 main INFO water.default: KrbStandalone extension initialized 11-05 14:45:30.117 127.0.0.1:54321 79164 main INFO water.default: 
Registered 2 core extensions in: 377ms 11-05 14:45:30.117 127.0.0.1:54321 79164 main INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone] 11-05 14:45:30.305 127.0.0.1:54321 79164 main INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal 11-05 14:45:30.306 127.0.0.1:54321 79164 main WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)! 11-05 14:45:30.406 127.0.0.1:54321 79164 main INFO water.default: Registered: 257 REST APIs in: 289ms 11-05 14:45:30.406 127.0.0.1:54321 79164 main INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4] 11-05 14:45:30.506 127.0.0.1:54321 79164 main INFO water.default: Registered: 311 schemas in 100ms 11-05 14:45:30.506 127.0.0.1:54321 79164 main INFO water.default: Locking cloud to new members, because H2O is started in a single node configuration. Converting /Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip to pojo.java... 11-05 14:45:30.642 127.0.0.1:54321 79164 FJ-1-15 INFO water.default: Starting model Generic_model_1636137928927_1 11-05 14:45:30.747 127.0.0.1:54321 79164 FJ-1-15 INFO water.default: Completing model Generic_model_1636137928927_1 DONE
0
# Display the content of the POJO
with open('pojo.java', 'r') as f:
    print(f.read())
/* Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0.html AUTOGENERATED BY H2O at 2021-11-05T14:45:30.759-04:00 3.35.0.99999 Standalone prediction code with sample test data for GBMModel named Generic_model_1636137928927_1 How to download, compile and execute: mkdir tmpdir cd tmpdir curl http:/localhost/127.0.0.1:54321/3/h2o-genmodel.jar > h2o-genmodel.jar curl http:/localhost/127.0.0.1:54321/3/Models.java/Generic_model_1636137928927_1 > Generic_model_1636137928927_1.java javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m Generic_model_1636137928927_1.java (Note: Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.) */ import java.util.Map; import hex.genmodel.GenModel; import hex.genmodel.annotations.ModelPojo; @ModelPojo(name="Generic_model_1636137928927_1", algorithm="gbm") public class Generic_model_1636137928927_1 extends GenModel { public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; } public boolean isSupervised() { return true; } public int nfeatures() { return 7; } public int nclasses() { return 2; } // Names of columns used by model. public static final String[] NAMES = NamesHolder_Generic_model_1636137928927_1.VALUES; // Number of output classes included in training data response column. public static final int NCLASSES = 2; // Column domains. The last array contains domain of response column. 
public static final String[][] DOMAINS = new String[][] { /* AGE */ null, /* RACE */ null, /* DPROS */ null, /* DCAPS */ null, /* PSA */ null, /* VOL */ null, /* GLEASON */ null, /* CAPSULE */ Generic_model_1636137928927_1_ColInfo_7.VALUES }; // Prior class distribution public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; // Class distribution used for model building public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; public Generic_model_1636137928927_1() { super(NAMES,DOMAINS,"CAPSULE"); } public String getUUID() { return Long.toString(4988040225257658559L); } // Pass in data in a double[], pre-aligned to the Model's requirements. // Jam predictions into the preds[] array; preds[0] is reserved for the // main prediction (class for classifiers or value for regression), // and remaining columns hold a probability distribution for classifiers. public final double[] score0( double[] data, double[] preds ) { java.util.Arrays.fill(preds,0); Generic_model_1636137928927_1_Forest_0.score0(data,preds); preds[2] = preds[1] + -0.3945120960889672; preds[2] = 1./(1. 
+ Math.min(1e19, Math.exp(-(preds[2])))); preds[1] = 1.0-preds[2]; preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997); return preds; } } // The class representing training column names class NamesHolder_Generic_model_1636137928927_1 implements java.io.Serializable { public static final String[] VALUES = new String[7]; static { NamesHolder_Generic_model_1636137928927_1_0.fill(VALUES); } static final class NamesHolder_Generic_model_1636137928927_1_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "AGE"; sa[1] = "RACE"; sa[2] = "DPROS"; sa[3] = "DCAPS"; sa[4] = "PSA"; sa[5] = "VOL"; sa[6] = "GLEASON"; } } } // The class representing column CAPSULE class Generic_model_1636137928927_1_ColInfo_7 implements java.io.Serializable { public static final String[] VALUES = new String[2]; static { Generic_model_1636137928927_1_ColInfo_7_0.fill(VALUES); } static final class Generic_model_1636137928927_1_ColInfo_7_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "0"; sa[1] = "1"; } } } class Generic_model_1636137928927_1_Forest_0 { public static void score0(double[] fdata, double[] preds) { preds[1] += Generic_model_1636137928927_1_Tree_0_class_0.score0(fdata); } } class Generic_model_1636137928927_1_Tree_0_class_0 { static final double score0(double[] data) { double pred = (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5f ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? (data[6 /* GLEASON */] < 5.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.44375f ? -0.16740088f : (data[5 /* VOL */] < 35.319237f ? -0.0842475f : -0.16740088f)) : (data[2 /* DPROS */] < 1.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.695312f ? -0.09571693f : -0.16740088f) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375f ? -0.07830798f : 0.005835324f))) : (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.645508f ? (data[5 /* VOL */] < 4.4484377f ? 
-0.007490538f : (data[4 /* PSA */] < 3.6390624f ? -0.16740088f : -0.1258242f)) : (data[6 /* GLEASON */] < 5.5f ? -0.05400991f : (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.1625f ? 0.17277204f : 0.040482566f)))) : (data[4 /* PSA */] < 14.730078f ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.1585937f ? (data[4 /* PSA */] < 7.995f ? 0.10977705f : -0.039472606f) : -0.12363595f) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264843f ? (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187f ? 0.1524198f : -0.042670812f) : 0.22390914f)) : (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5f ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812f ? (data[4 /* PSA */] < 18.55625f ? 0.24836601f : 0.11424766f) : 0.01078493f) : (data[4 /* PSA */] < 22.60625f ? 0.12363594f : 0.24836601f)))); return pred; } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B }
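To make the consequence of the `f` suffix concrete, here is a minimal sketch of how the split `data[5 /* VOL */] < 25.695312f` from the POJO above behaves when the literal is copied verbatim into a system that treats it as a 64-bit number. The helper name `widen_float_literal` and the input value `25.6953124` are assumptions made for illustration only. The helper parses the text as a double first and then rounds it to 32 bits, which matches Java for the values shown here but is not guaranteed to match for every possible literal.

```python
import struct


def widen_float_literal(text):
    """Approximate the double Java compares against for a float literal like '25.695312f'."""
    value = float(text.rstrip("fF"))
    # Round to the nearest 32-bit float, then widen back to a 64-bit double.
    return struct.unpack("<f", struct.pack("<f", value))[0]


vol = 25.6953124  # hypothetical VOL value, chosen to fall between the two readings

naive_split = 25.695312  # the literal copied as-is, without the 'f' suffix
java_split = widen_float_literal("25.695312f")  # what the POJO effectively uses

goes_left_naive = vol < naive_split
goes_left_java = vol < java_split

print("split used by Java:", java_split)
print("naive translation goes left:", goes_left_naive)
print("POJO goes left:", goes_left_java)
```

For this hypothetical input the naive translation sends the row to the right branch, while the POJO sends it to the left branch, so the two would return different leaf values.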
# Now set the system property sys.ai.h2o.java.output.doubles so that the converter writes numbers in 64-bit precision
subprocess.call(["java", "-Dsys.ai.h2o.java.output.doubles=true", "-cp", h2o_jar, "water.tools.MojoConvertTool", mojo_path, "pojo64.java"], stderr=subprocess.STDOUT, shell=False)
Starting local H2O instance to facilitate MOJO to POJO conversion. 14:45:31.502 [main] INFO hex.tree.xgboost.util.NativeLibrary - Loaded library from lib/osx_64/libxgboost4j_minimal.dylib (/var/folders/v1/fkjmcbkd11v2mrm4dm6345ym0000gn/T/libxgboost4j_minimal978915340387551523.dylib) 11-05 14:45:31.628 127.0.0.1:54321 79166 main INFO water.default: ----- H2O started ----- 11-05 14:45:31.628 127.0.0.1:54321 79166 main INFO water.default: Build git branch: master 11-05 14:45:31.628 127.0.0.1:54321 79166 main INFO water.default: Build git hash: b9ba1af5f07c6dbc6369e41113ea43947109e054 11-05 14:45:31.628 127.0.0.1:54321 79166 main INFO water.default: Build git describe: jenkins-master-5625-7-gb9ba1af5f0 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Build project version: 3.35.0.99999 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Build age: 2 hours and 53 minutes 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Built by: 'mkurka' 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Built on: '2021-11-05 11:51:34' 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Found H2O Core extensions: [XGBoost, KrbStandalone] 11-05 14:45:31.629 127.0.0.1:54321 79166 main INFO water.default: Processed H2O arguments: [-disable_web, -ip, localhost, -disable_net] 11-05 14:45:31.630 127.0.0.1:54321 79166 main INFO water.default: Java availableProcessors: 16 11-05 14:45:31.630 127.0.0.1:54321 79166 main INFO water.default: Java heap totalMemory: 491.0 MB 11-05 14:45:31.630 127.0.0.1:54321 79166 main INFO water.default: Java heap maxMemory: 7.11 GB 11-05 14:45:31.630 127.0.0.1:54321 79166 main INFO water.default: Java version: Java 1.8.0_311 (from Oracle Corporation) 11-05 14:45:31.630 127.0.0.1:54321 79166 main INFO water.default: JVM launch parameters: [-Dsys.ai.h2o.java.output.doubles=true] 11-05 14:45:31.631 127.0.0.1:54321 79166 main INFO water.default: JVM process id: 79166@michals-mbp.lan 11-05 
14:45:31.631 127.0.0.1:54321 79166 main INFO water.default: OS version: Mac OS X 10.16 (x86_64) 11-05 14:45:31.631 127.0.0.1:54321 79166 main INFO water.default: Machine physical memory: 32.00 GB 11-05 14:45:31.631 127.0.0.1:54321 79166 main INFO water.default: Machine locale: en_US 11-05 14:45:31.633 127.0.0.1:54321 79166 main INFO water.default: X-h2o-cluster-id: 1636137931013 11-05 14:45:31.633 127.0.0.1:54321 79166 main INFO water.default: User name: 'mkurka' 11-05 14:45:31.633 127.0.0.1:54321 79166 main INFO water.default: IPv6 stack selected: false 11-05 14:45:31.633 127.0.0.1:54321 79166 main INFO water.default: H2O node running in unencrypted mode. 11-05 14:45:32.130 127.0.0.1:54321 79166 main INFO water.default: Kerberos not configured 11-05 14:45:32.130 127.0.0.1:54321 79166 main INFO water.default: Log dir: '/tmp/h2o-mkurka/h2ologs' 11-05 14:45:32.130 127.0.0.1:54321 79166 main INFO water.default: Cur dir: '/Users/mkurka/git/h2o/h2o-3' 11-05 14:45:32.136 127.0.0.1:54321 79166 main INFO water.default: Subsystem for distributed import from HTTP/HTTPS successfully initialized 11-05 14:45:32.137 127.0.0.1:54321 79166 main INFO water.default: HDFS subsystem successfully initialized 11-05 14:45:32.140 127.0.0.1:54321 79166 main INFO water.default: S3 subsystem successfully initialized 11-05 14:45:32.151 127.0.0.1:54321 79166 main INFO water.default: GCS subsystem successfully initialized 11-05 14:45:32.152 127.0.0.1:54321 79166 main INFO water.default: Flow dir: '/Users/mkurka/h2oflows' 11-05 14:45:32.158 127.0.0.1:54321 79166 main INFO water.default: Cloud of size 1 formed [localhost/127.0.0.1:54321] 11-05 14:45:32.164 127.0.0.1:54321 79166 main INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, CSV] 11-05 14:45:32.165 127.0.0.1:54321 79166 main INFO water.default: XGBoost extension initialized 11-05 14:45:32.166 127.0.0.1:54321 79166 main INFO water.default: KrbStandalone extension initialized 11-05 14:45:32.166 
127.0.0.1:54321 79166 main INFO water.default: Registered 2 core extensions in: 376ms 11-05 14:45:32.166 127.0.0.1:54321 79166 main INFO water.default: Registered H2O core extensions: [XGBoost, KrbStandalone] 11-05 14:45:32.353 127.0.0.1:54321 79166 main INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal 11-05 14:45:32.353 127.0.0.1:54321 79166 main WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)! 11-05 14:45:32.467 127.0.0.1:54321 79166 main INFO water.default: Registered: 257 REST APIs in: 301ms 11-05 14:45:32.467 127.0.0.1:54321 79166 main INFO water.default: Registered REST API extensions: [Amazon S3, XGBoost, Algos, AutoML, Core V3, TargetEncoder, Core V4] 11-05 14:45:32.564 127.0.0.1:54321 79166 main INFO water.default: Registered: 311 schemas in 96ms 11-05 14:45:32.565 127.0.0.1:54321 79166 main INFO water.default: Locking cloud to new members, because H2O is started in a single node configuration. Converting /Users/mkurka/git/h2o/h2o-3/GBM_model_python_1636137917875_1.zip to pojo64.java... 11-05 14:45:32.700 127.0.0.1:54321 79166 FJ-1-15 INFO water.default: Starting model Generic_model_1636137931013_1 11-05 14:45:32.804 127.0.0.1:54321 79166 FJ-1-15 INFO water.default: Completing model Generic_model_1636137931013_1 DONE
0
# Display the content of the POJO with 64-bit number representation
with open('pojo64.java', 'r') as f:
    print(f.read())
/* Licensed under the Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0.html AUTOGENERATED BY H2O at 2021-11-05T14:45:32.815-04:00 3.35.0.99999 Standalone prediction code with sample test data for GBMModel named Generic_model_1636137931013_1 How to download, compile and execute: mkdir tmpdir cd tmpdir curl http:/localhost/127.0.0.1:54321/3/h2o-genmodel.jar > h2o-genmodel.jar curl http:/localhost/127.0.0.1:54321/3/Models.java/Generic_model_1636137931013_1 > Generic_model_1636137931013_1.java javac -cp h2o-genmodel.jar -J-Xmx2g -J-XX:MaxPermSize=128m Generic_model_1636137931013_1.java (Note: Try java argument -XX:+PrintCompilation to show runtime JIT compiler behavior.) */ import java.util.Map; import hex.genmodel.GenModel; import hex.genmodel.annotations.ModelPojo; @ModelPojo(name="Generic_model_1636137931013_1", algorithm="gbm") public class Generic_model_1636137931013_1 extends GenModel { public hex.ModelCategory getModelCategory() { return hex.ModelCategory.Binomial; } public boolean isSupervised() { return true; } public int nfeatures() { return 7; } public int nclasses() { return 2; } // Names of columns used by model. public static final String[] NAMES = NamesHolder_Generic_model_1636137931013_1.VALUES; // Number of output classes included in training data response column. public static final int NCLASSES = 2; // Column domains. The last array contains domain of response column. 
public static final String[][] DOMAINS = new String[][] { /* AGE */ null, /* RACE */ null, /* DPROS */ null, /* DCAPS */ null, /* PSA */ null, /* VOL */ null, /* GLEASON */ null, /* CAPSULE */ Generic_model_1636137931013_1_ColInfo_7.VALUES }; // Prior class distribution public static final double[] PRIOR_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; // Class distribution used for model building public static final double[] MODEL_CLASS_DISTRIB = {0.5973684210526315,0.4026315789473684}; public Generic_model_1636137931013_1() { super(NAMES,DOMAINS,"CAPSULE"); } public String getUUID() { return Long.toString(4988040225257658559L); } // Pass in data in a double[], pre-aligned to the Model's requirements. // Jam predictions into the preds[] array; preds[0] is reserved for the // main prediction (class for classifiers or value for regression), // and remaining columns hold a probability distribution for classifiers. public final double[] score0( double[] data, double[] preds ) { java.util.Arrays.fill(preds,0); Generic_model_1636137931013_1_Forest_0.score0(data,preds); preds[2] = preds[1] + -0.3945120960889672; preds[2] = 1./(1. 
+ Math.min(1e19, Math.exp(-(preds[2])))); preds[1] = 1.0-preds[2]; preds[0] = hex.genmodel.GenModel.getPrediction(preds, PRIOR_CLASS_DISTRIB, data, 0.4008312811161997); return preds; } } // The class representing training column names class NamesHolder_Generic_model_1636137931013_1 implements java.io.Serializable { public static final String[] VALUES = new String[7]; static { NamesHolder_Generic_model_1636137931013_1_0.fill(VALUES); } static final class NamesHolder_Generic_model_1636137931013_1_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "AGE"; sa[1] = "RACE"; sa[2] = "DPROS"; sa[3] = "DCAPS"; sa[4] = "PSA"; sa[5] = "VOL"; sa[6] = "GLEASON"; } } } // The class representing column CAPSULE class Generic_model_1636137931013_1_ColInfo_7 implements java.io.Serializable { public static final String[] VALUES = new String[2]; static { Generic_model_1636137931013_1_ColInfo_7_0.fill(VALUES); } static final class Generic_model_1636137931013_1_ColInfo_7_0 implements java.io.Serializable { static final void fill(String[] sa) { sa[0] = "0"; sa[1] = "1"; } } } class Generic_model_1636137931013_1_Forest_0 { public static void score0(double[] fdata, double[] preds) { preds[1] += Generic_model_1636137931013_1_Tree_0_class_0.score0(fdata); } } class Generic_model_1636137931013_1_Tree_0_class_0 { static final double score0(double[] data) { double pred = (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 6.5 ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? (data[6 /* GLEASON */] < 5.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.443750381469727 ? -0.16740088164806366 : (data[5 /* VOL */] < 35.319236755371094 ? -0.08424749970436096 : -0.16740088164806366)) : (data[2 /* DPROS */] < 1.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 25.6953125 ? -0.0957169309258461 : -0.16740088164806366) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 23.359375 ? 
-0.07830797880887985 : 0.005835324060171843))) : (Double.isNaN(data[4]) || data[4 /* PSA */] < 6.6455078125 ? (data[5 /* VOL */] < 4.448437690734863 ? -0.007490538060665131 : (data[4 /* PSA */] < 3.6390624046325684 ? -0.16740088164806366 : -0.1258241981267929)) : (data[6 /* GLEASON */] < 5.5 ? -0.05400991067290306 : (Double.isNaN(data[5]) || data[5 /* VOL */] < 19.162500381469727 ? 0.17277203500270844 : 0.04048256576061249)))) : (data[4 /* PSA */] < 14.730077743530273 ? (Double.isNaN(data[2]) || data[2 /* DPROS */] < 2.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 7.158593654632568 ? (data[4 /* PSA */] < 7.994999885559082 ? 0.1097770482301712 : -0.03947260603308678) : -0.12363594770431519) : (Double.isNaN(data[5]) || data[5 /* VOL */] < 17.264842987060547 ? (Double.isNaN(data[4]) || data[4 /* PSA */] < 8.267187118530273 ? 0.1524198055267334 : -0.04267081245779991) : 0.2239091396331787)) : (Double.isNaN(data[6]) || data[6 /* GLEASON */] < 7.5 ? (Double.isNaN(data[5]) || data[5 /* VOL */] < 24.657812118530273 ? (data[4 /* PSA */] < 18.556249618530273 ? 0.24836601316928864 : 0.11424765735864639) : 0.010784929618239403) : (data[4 /* PSA */] < 22.606250762939453 ? 0.12363594025373459 : 0.24836601316928864)))); return pred; } // constant pool size = 94B, number of visited nodes = 23, static init size = 0B }
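The two POJOs can also be cross-checked mechanically. The sketch below takes one fragment of the tree from each POJO (copied from the outputs above), extracts the decimal literals with a regular expression, and checks that every 32-bit literal, once rounded to a float and widened to a double, equals the corresponding 64-bit literal. The names `widen`, `POJO32`, `POJO64`, and `DECIMAL` are hypothetical, and the regular expression is an assumption that only fits simple literals like the ones in these fragments.

```python
import re
import struct


def widen(literal):
    """Round a decimal string to a 32-bit float and widen it back to a double."""
    return struct.unpack("<f", struct.pack("<f", float(literal)))[0]


# Fragments copied from the 32-bit and 64-bit POJO outputs above.
POJO32 = ("data[5 /* VOL */] < 19.44375f ? -0.16740088f : "
          "(data[5 /* VOL */] < 35.319237f ? -0.0842475f : -0.16740088f)")
POJO64 = ("data[5 /* VOL */] < 19.443750381469727 ? -0.16740088164806366 : "
          "(data[5 /* VOL */] < 35.319236755371094 ? -0.08424749970436096 : "
          "-0.16740088164806366)")

# Matches literals with a decimal point; the column index in data[5] has none.
DECIMAL = re.compile(r"-?\d+\.\d+")

pairs = list(zip(DECIMAL.findall(POJO32), DECIMAL.findall(POJO64)))
all_match = all(widen(a) == float(b) for a, b in pairs)

for a, b in pairs:
    print(f"{a}f -> {widen(a)!r} (64-bit POJO: {b})")
print("all literals agree after widening:", all_match)
```

If the check passes, the 64-bit POJO can be read as the 32-bit POJO with every float literal already widened, which is the form that is safe to copy into a representation that only has 64-bit numbers.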