# IsoTree to TreeLite

This is a short example of converting an Isolation Forest model generated through the isotree library to treelite format. The treelite model can then be compiled into a standalone runtime library, which is often faster at making predictions.

### Getting some medium-size data from scikit-learn to fit a model

In [1]:
import numpy as np
from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True)
print(X.shape)

(20640, 8)


### Fitting an isolation forest model through isotree

Note: only models that use ndim=1 can be exported to treelite format.
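The restriction above can be checked before attempting the conversion. The following is a minimal sketch: `check_exportable` is a hypothetical helper (not part of isotree), and it assumes the model object exposes its constructor argument as an `ndim` attribute, as scikit-learn-style estimators usually do.

```python
def check_exportable(model):
    """Fail early if the model was not built with single-variable splits.

    Hypothetical helper: assumes the model stores its `ndim` constructor
    argument as an attribute of the same name.
    """
    if getattr(model, "ndim", None) != 1:
        raise ValueError("only models that use ndim=1 can be exported to treelite")
    return model
```

With a model at hand, this would be called as `check_exportable(iso)` right before `iso.to_treelite()`.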

In [2]:
from isotree import IsolationForest

iso = IsolationForest(ndim=1, ntrees=100, sample_size=256,
                      missing_action="impute", max_depth=8)
iso.fit(X)

### Now convert
treelite_model = iso.to_treelite()

### OPTIONAL: add annotations for better branch prediction
import treelite, treelite_runtime
annotator = treelite.Annotator()
annotator.annotate_branch(
    model=treelite_model,
    dmat=treelite_runtime.DMatrix(X),
    verbose=False
)
annotator.save(path="iso_branches_annotation.json")


### Compiling the treelite model

These models need to be compiled into a shared library in order to be used:

In [3]:
%%capture
import treelite_runtime
import multiprocessing

treelite_model.compile(
    dirpath='.',
    params={
        "parallel_comp": multiprocessing.cpu_count(),
        "annotate_in": "iso_branches_annotation.json"
    }
)
treelite_model.export_lib("clang", ".")
treelite_predictor = treelite_runtime.Predictor("predictor.so")


Now verify that they make the same predictions:

In [4]:
iso.predict(X[:10])

Out[4]:
array([0.47006444, 0.47770081, 0.4910637 , 0.42605826, 0.41548625,
       0.41730139, 0.41699421, 0.43228664, 0.40877799, 0.41800632])
In [5]:
treelite_predictor.predict(treelite_runtime.DMatrix(X[:10]))

Out[5]:
array([0.47006445, 0.47770081, 0.4910637 , 0.42605827, 0.41548626,
       0.41730139, 0.41699421, 0.43228664, 0.40877799, 0.41800632])

Note: some small disagreement between the two is expected due to loss of precision when converting. See the documentation in isotree for more details.
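To put a number on that disagreement, the scores printed above can be compared directly. This is a small sketch using only the standard library and the values copied from Out[4] and Out[5]; the tolerance `TOL` is an illustrative choice, not a value taken from the isotree documentation.

```python
import math

# Scores copied from Out[4] (isotree) and Out[5] (compiled treelite predictor)
iso_scores = [0.47006444, 0.47770081, 0.4910637, 0.42605826, 0.41548625,
              0.41730139, 0.41699421, 0.43228664, 0.40877799, 0.41800632]
tl_scores = [0.47006445, 0.47770081, 0.4910637, 0.42605827, 0.41548626,
             0.41730139, 0.41699421, 0.43228664, 0.40877799, 0.41800632]

# Largest absolute difference between the two sets of scores
max_abs_diff = max(abs(a - b) for a, b in zip(iso_scores, tl_scores))

# Illustrative tolerance (an assumption, not a documented bound)
TOL = 1e-6
all_close = all(math.isclose(a, b, abs_tol=TOL)
                for a, b in zip(iso_scores, tl_scores))

print(f"max abs diff: {max_abs_diff:.1e}, within tolerance: {all_close}")
```

With the full arrays in memory, `np.allclose(a, b, atol=TOL)` performs the same kind of check in a single call.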

### Comparing prediction times

In [6]:
%%timeit
import multiprocessing

31.6 ms ± 1.15 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%%timeit

4.41 ms ± 21.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
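The statements that were timed are not shown in the cells above. Outside a notebook, a comparison of this kind could be sketched with the standard `timeit` module as follows. `best_time_ms` is a hypothetical helper, and the commented-out calls assume that the two `predict` calls from the verification step are the ones of interest.

```python
import timeit

def best_time_ms(fn, number=10, repeat=7):
    """Return the best per-call time of fn() in milliseconds.

    Hypothetical helper: runs `fn` `number` times per round, for
    `repeat` rounds, and keeps the fastest round.
    """
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    return min(totals) / number * 1e3

# Assumed usage with the objects created in the earlier cells:
# t_iso = best_time_ms(lambda: iso.predict(X))
# t_tl = best_time_ms(
#     lambda: treelite_predictor.predict(treelite_runtime.DMatrix(X)))

# Ratio between the two timings reported above (31.6 ms and 4.41 ms)
ratio = 31.6 / 4.41
print(f"{ratio:.1f}x")
```

Taking the minimum over rounds rather than the mean is a design choice that reduces the influence of background noise. `%%timeit` reports mean and standard deviation instead, so the numbers are not directly interchangeable.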