Versions used: automatminer 2019.10.14, matminer 0.6.2, Python 3.7.3, on macOS Mojave 10.14.6.
Automatminer is a package for automatically creating ML pipelines using matminer's featurizers, feature reduction techniques, and Automated Machine Learning (AutoML). Automatminer works end to end, from raw data to prediction, without any human input necessary.
Automatminer is competitive with state-of-the-art hand-tuned machine learning models across multiple domains of materials informatics. Automatminer also includes utilities for running MatBench, a materials science ML benchmark.
Automatminer automatically decorates a dataset using hundreds of descriptor techniques from matminer’s descriptor library, picks the most useful features for learning, and runs a separate AutoML pipeline. Once a pipeline has been fit, it can be summarized in a text file, saved to disk, or used to make predictions on new materials.
Materials primitives (e.g., crystal structures) go in one end, and property predictions come out the other. MatPipe handles the intermediate operations such as assigning descriptors, cleaning problematic data, data conversions, imputation, and machine learning.
MatPipe is the central object in Automatminer. It has a sklearn BaseEstimator syntax for fit and predict operations. Simply fit on your training data, then predict on your testing data.
Put dataframes (of materials) in, get dataframes (of property predictions) out.
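The fit/predict pattern described above can be sketched with a toy stand-in. The class below is not Automatminer's MatPipe: ToyPipe and its mean-value "model" are hypothetical, and records are plain dicts rather than dataframes, purely to show the shape of the interface (records in, records with a prediction field out).

```python
from statistics import mean


class ToyPipe:
    """Hypothetical stand-in illustrating a fit/predict interface (not MatPipe)."""

    def fit(self, records, target):
        # "Train" by remembering the mean of the target values.
        self.target = target
        self.mean_ = mean(r[target] for r in records)
        return self  # returning self mirrors the sklearn convention

    def predict(self, records):
        # Return copies of the inputs with a "{target} predicted" field added.
        key = "{} predicted".format(self.target)
        return [{**r, key: self.mean_} for r in records]


# Hypothetical toy records standing in for dataframes of materials.
train = [
    {"composition": "A", "gap expt": 0.0},
    {"composition": "B", "gap expt": 2.0},
]
test = [{"composition": "C"}]

toy = ToyPipe().fit(train, "gap expt")
predictions = toy.predict(test)
print(predictions)
```

The real MatPipe does far more between fit and predict (featurization, cleaning, feature reduction, model search), but the calling convention is the same two steps.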
In this notebook, we walk through the basic steps of using Automatminer to train and predict on data. We'll also view the internals of our AutoML pipeline using Automatminer's API.
Along the way we load a dataset, fit a MatPipe (pipeline) to the data, predict on held-out data, and examine the pipeline using MatPipe's introspection methods.
Note: for the sake of brevity, we will use a single train-test split in this notebook. To run a full Automatminer benchmark, see the documentation for MatPipe.benchmark.
Let's load a dataset to play around with. For this example, we will use matminer to load one of the MatBench v0.1 datasets. If you have been through some of the machine learning or data retrieval tutorials in this repo, you will be familiar with the commands needed to fetch our dataset as a dataframe.
from matminer.datasets import load_dataset
df = load_dataset("matbench_expt_gap")
# Let's look at our dataset
df.describe()
 | gap expt |
---|---|
count | 4604.000000 |
mean | 0.975951 |
std | 1.445034 |
min | 0.000000 |
25% | 0.000000 |
50% | 0.000000 |
75% | 1.812500 |
max | 11.700000 |
df.head()
 | composition | gap expt |
---|---|---|
0 | Ag(AuS)2 | 0.00 |
1 | Ag(W3Br7)2 | 0.00 |
2 | Ag0.5Ge1Pb1.75S4 | 1.83 |
3 | Ag0.5Ge1Pb1.75Se4 | 1.51 |
4 | Ag2BBr | 0.00 |
We should find all the compositions are unique.
# How many unique compositions do we have?
df["composition"].unique().shape[0]
4604
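The same uniqueness check can be phrased so that it also reports which entries repeat. This is a small sketch on a hypothetical list of formulas (not the dataset itself), using only the standard library:

```python
from collections import Counter

# Hypothetical formulas with one deliberate repeat.
formulas = ["ZnSb", "TiAlAu2", "ZnSb", "La2GeSe5"]

counts = Counter(formulas)
duplicates = sorted(f for f, n in counts.items() if n > 1)

print(len(counts))   # number of distinct formulas
print(duplicates)    # formulas that appear more than once
```

When the number of distinct entries equals the number of rows, as it does for this dataset, the duplicates list is empty.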
from sklearn.model_selection import train_test_split
train_df, test_df = train_test_split(df, test_size=0.2, shuffle=True, random_state=20191014)
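As a quick sanity check on the split, the expected subset sizes can be worked out by hand. This sketch assumes scikit-learn's behavior of rounding a fractional test_size up to the next whole sample; it does not call scikit-learn itself.

```python
import math

n_samples = 4604   # rows in the dataset, from df.describe()
test_size = 0.2    # fraction passed to train_test_split

# Assumption: the test set size is rounded up, the rest goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)
```

These values agree with the counts that appear later in the notebook: 3683 samples in the fitting log and 921 rows in prediction_df.describe().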
Let's remove the testing dataframe's target property so we can be sure we are not giving Automatminer any test information.
Our target variable is "gap expt".
target = "gap expt"
prediction_df = test_df.drop(columns=[target])
prediction_df.head()
 | composition |
---|---|
4514 | ZnSb |
834 | Co1Te1.88 |
4481 | Zn2Ni9O13 |
3958 | TiAlAu2 |
3087 | Pr(MnSi)2 |
prediction_df.describe()
 | composition |
---|---|
count | 921 |
unique | 921 |
top | La2GeSe5 |
freq | 1 |
Our dataset contains 4,604 unique stoichiometries and experimentally measured band gaps. We have everything we need to start our AutoML pipeline.
For simplicity, we will use a MatPipe preset. MatPipe is highly customizable and has hundreds of configuration options, but most use cases will be satisfied by one of the preset configurations. We use the from_preset method.
In this example, we'll use the "express" preset, which will take approximately an hour.
from automatminer import MatPipe
pipe = MatPipe.from_preset("express")
/Users/ardunn/alex/lbl/projects/common_env/common_env3/lib/python3.7/site-packages/sklearn/externals/joblib/__init__.py:15: DeprecationWarning: sklearn.externals.joblib is deprecated in 0.21 and will be removed in 0.23. Please import this functionality directly from joblib, which can be installed with: pip install joblib. If this warning is raised when loading pickled models, you may need to re-serialize those models with scikit-learn 0.21+.
To fit an Automatminer MatPipe to the data, pass in your training data and desired target.
pipe.fit(train_df, target)
2019-10-14 20:51:56 INFO Problem type is: regression 2019-10-14 20:51:56 INFO Fitting MatPipe pipeline to data. 2019-10-14 20:51:56 INFO AutoFeaturizer: Starting fitting. 2019-10-14 20:51:56 INFO AutoFeaturizer: Compositions detected as strings. Attempting conversion to Composition objects...
2019-10-14 20:51:56 INFO AutoFeaturizer: Guessing oxidation states of compositions, as they were not present in input.
2019-10-14 20:52:55 INFO AutoFeaturizer: Will remove YangSolidSolution because it's fraction passing the precheck for this dataset (0.4051045343469997) was less than the minimum (0.9) 2019-10-14 20:52:55 INFO AutoFeaturizer: Will remove Miedema because it's fraction passing the precheck for this dataset (0.4051045343469997) was less than the minimum (0.9) 2019-10-14 20:52:55 INFO AutoFeaturizer: Featurizer type structure not in the dataframe to be fitted. Skipping... 2019-10-14 20:52:55 INFO AutoFeaturizer: Featurizer type bandstructure not in the dataframe to be fitted. Skipping... 2019-10-14 20:52:55 INFO AutoFeaturizer: Featurizer type dos not in the dataframe to be fitted. Skipping... 2019-10-14 20:52:55 INFO AutoFeaturizer: Finished fitting. 2019-10-14 20:52:55 INFO AutoFeaturizer: Starting transforming. 2019-10-14 20:52:55 INFO AutoFeaturizer: Featurizing with ElementProperty.
2019-10-14 20:53:03 INFO AutoFeaturizer: Featurizing with OxidationStates.
2019-10-14 20:53:03 INFO AutoFeaturizer: Featurizing with ElectronAffinity.
2019-10-14 20:53:03 INFO AutoFeaturizer: Featurizing with IonProperty.
2019-10-14 20:53:15 INFO AutoFeaturizer: Featurizer type structure not in the dataframe. Skipping... 2019-10-14 20:53:15 INFO AutoFeaturizer: Featurizer type bandstructure not in the dataframe. Skipping... 2019-10-14 20:53:15 INFO AutoFeaturizer: Featurizer type dos not in the dataframe. Skipping... 2019-10-14 20:53:15 INFO AutoFeaturizer: Finished transforming. 2019-10-14 20:53:15 INFO DataCleaner: Starting fitting. 2019-10-14 20:53:15 INFO DataCleaner: Cleaning with respect to samples with sample na_method 'drop' 2019-10-14 20:53:15 INFO DataCleaner: Replacing infinite values with nan for easier screening. 2019-10-14 20:53:16 INFO DataCleaner: Before handling na: 3683 samples, 141 features 2019-10-14 20:53:16 INFO DataCleaner: 0 samples did not have target values. They were dropped. 2019-10-14 20:53:16 INFO DataCleaner: Handling feature na by max na threshold of 0.01 with method 'drop'. 2019-10-14 20:53:16 INFO DataCleaner: These 8 features were removed as they had more than 1.0% missing values: {'avg ionic char', 'compound possible', 'std_dev oxidation state', 'minimum oxidation state', 'avg anion electron affinity', 'maximum oxidation state', 'max ionic char', 'range oxidation state'} 2019-10-14 20:53:16 INFO DataCleaner: After handling na: 3683 samples, 133 features 2019-10-14 20:53:16 INFO DataCleaner: Finished fitting. 2019-10-14 20:53:16 INFO FeatureReducer: Starting fitting. 2019-10-14 20:53:16 INFO FeatureReducer: 18 features removed due to cross correlation more than 0.95
/Users/ardunn/alex/lbl/projects/common_env/common_env3/lib/python3.7/site-packages/sklearn/ensemble/forest.py:245: FutureWarning: The default value of n_estimators will change from 10 in version 0.20 to 100 in 0.22.
2019-10-14 20:54:21 INFO TreeFeatureReducer: Finished tree-based feature reduction of 114 initial features to 48 2019-10-14 20:54:21 INFO FeatureReducer: Finished fitting. 2019-10-14 20:54:21 INFO FeatureReducer: Starting transforming. 2019-10-14 20:54:21 INFO FeatureReducer: Finished transforming. 2019-10-14 20:54:21 INFO TPOTAdaptor: Starting fitting. 28 operators have been imported by TPOT.
_pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=1 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required by StandardScaler.. Skipped pipeline #38 due to time out. Continuing to the next pipeline. Generation 1 - Current Pareto front scores: -3 -0.5019658700076691 RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20) Generation 2 - Current Pareto front scores: -3 -0.49662437839581725 RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000) _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. 
Generation 3 - Current Pareto front scores: -3 -0.4704077886260397 RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20) _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. Generation 4 - Current Pareto front scores: -3 -0.4684578070615303 RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500) Pipeline encountered that has previously been evaluated during the optimization process. Using the score from the previous evaluation. 
Generation 5 - Current Pareto front scores: -3 -0.46727537793493007 RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000) Generation 6 - Current Pareto front scores: -3 -0.46136176692378017 RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000) _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. Generation 7 - Current Pareto front scores: -3 -0.455318603917173 RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200) _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. Skipped pipeline #164 due to time out. Continuing to the next pipeline. Skipped pipeline #167 due to time out. Continuing to the next pipeline. 
Generation 8 - Current Pareto front scores: -3 -0.455318603917173 RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200) _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. _pre_test decorator: _random_mutation_operator: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.. 61.48513473333333 minutes have elapsed. TPOT will close down. TPOT closed during evaluation in one generation. WARNING: TPOT may not provide a good pipeline if TPOT is stopped/interrupted in a early generation. TPOT closed prematurely. Will use the current best pipeline. 2019-10-14 21:56:26 INFO TPOTAdaptor: Finished fitting. 2019-10-14 21:56:26 INFO MatPipe successfully fit.
MatPipe(autofeaturizer=AutoFeaturizer(bandstructure_col=None, cache_src=None, composition_col='composition', do_precheck=True, dos_col='dos', drop_inputs=True, exclude=[], featurizers={'bandstructure': [BandFeaturizer(find_method='nearest', kpoints=None, nbands=2), BranchPointEnergy(atol=1e-05, calculate_band_edges=True, n_cb=1, n_vb=1)], 'composition': [ElementPr... max_na_frac=0.01, na_method_fit='drop', na_method_transform='fill'), learner=TPOTAdaptor(logger=<Logger automatminer (INFO)>), logger=<Logger automatminer (INFO)>, reducer=FeatureReducer(corr_threshold=0.95, keep_features=None, logger=<Logger automatminer (INFO)>, n_pca_features='auto', n_rebate_features=0.3, reducers=('corr', 'tree'), remove_features=None, tree_importance_percentile=0.99))
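The feature counts reported in the fitting log can be reconciled with a little arithmetic. This sketch assumes (as an inference from the logs, not something they state) that the fit-time DataCleaner counts include the target column, which would explain why the same step reports 140 and 132 features at prediction time, when no target is present.

```python
# Numbers copied from the fitting log above.
cleaner_before = 141       # "Before handling na: 3683 samples, 141 features"
dropped_for_na = 8         # features with more than 1% missing values
dropped_for_corr = 18      # cross correlation more than 0.95
tree_input_reported = 114  # "tree-based feature reduction of 114 initial features"
tree_output = 48           # features kept after tree-based reduction

# Assumption: the DataCleaner counts include one target column.
target_columns = 1

cleaner_after = cleaner_before - dropped_for_na
tree_input = cleaner_after - target_columns - dropped_for_corr

print(cleaner_after, tree_input, tree_output)
```

Under that assumption the numbers line up: 133 columns leave the cleaner, 114 features reach the tree-based reducer, and 48 features are passed to the learner.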
Our MatPipe is now fit. Let's predict our test data with MatPipe.predict. This should only take a few minutes.
prediction_df = pipe.predict(prediction_df)
2019-10-14 21:56:56 INFO Beginning MatPipe prediction using fitted pipeline. 2019-10-14 21:56:56 INFO AutoFeaturizer: Starting transforming. 2019-10-14 21:56:56 INFO AutoFeaturizer: Compositions detected as strings. Attempting conversion to Composition objects...
2019-10-14 21:56:57 INFO AutoFeaturizer: Guessing oxidation states of compositions, as they were not present in input.
2019-10-14 21:57:08 INFO AutoFeaturizer: Featurizing with ElementProperty.
2019-10-14 21:57:10 INFO AutoFeaturizer: Featurizing with OxidationStates.
2019-10-14 21:57:10 INFO AutoFeaturizer: Featurizing with ElectronAffinity.
2019-10-14 21:57:10 INFO AutoFeaturizer: Featurizing with IonProperty.
2019-10-14 21:58:45 INFO AutoFeaturizer: Featurizer type structure not in the dataframe. Skipping... 2019-10-14 21:58:45 INFO AutoFeaturizer: Featurizer type bandstructure not in the dataframe. Skipping... 2019-10-14 21:58:45 INFO AutoFeaturizer: Featurizer type dos not in the dataframe. Skipping... 2019-10-14 21:58:45 INFO AutoFeaturizer: Finished transforming. 2019-10-14 21:58:45 INFO DataCleaner: Starting transforming. 2019-10-14 21:58:45 INFO DataCleaner: Cleaning with respect to samples with sample na_method 'fill' 2019-10-14 21:58:45 INFO DataCleaner: Replacing infinite values with nan for easier screening. 2019-10-14 21:58:45 INFO DataCleaner: Before handling na: 921 samples, 140 features 2019-10-14 21:58:45 WARNING DataCleaner: Mismatched columns found in dataframe used for fitting and argument dataframe. 2019-10-14 21:58:45 WARNING DataCleaner: Coercing mismatched columns... 2019-10-14 21:58:45 WARNING DataCleaner: Following columns are being dropped: ['minimum oxidation state', 'maximum oxidation state', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'compound possible', 'max ionic char', 'avg ionic char'] 2019-10-14 21:58:45 INFO DataCleaner: After handling na: 921 samples, 132 features 2019-10-14 21:58:45 INFO DataCleaner: Target not found in df columns. Ignoring... 2019-10-14 21:58:45 INFO DataCleaner: Finished transforming. 2019-10-14 21:58:45 INFO FeatureReducer: Starting transforming. 2019-10-14 21:58:45 WARNING FeatureReducer: Target not found in columns to transform. 2019-10-14 21:58:45 INFO FeatureReducer: Finished transforming. 2019-10-14 21:58:45 INFO TPOTAdaptor: Starting predicting. 2019-10-14 21:58:46 INFO TPOTAdaptor: Prediction finished successfully. 2019-10-14 21:58:46 INFO TPOTAdaptor: Finished predicting. 2019-10-14 21:58:46 INFO MatPipe prediction completed.
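To put the "approximately an hour" and "a few minutes" estimates in context, the wall-clock durations can be read off the log timestamps. The sketch below only parses the first and last timestamps of the fitting and prediction logs shown in this notebook; the helper name elapsed_seconds is introduced here for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start, end):
    """Seconds between two log timestamps given in the log's own format."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# First and last timestamps of the fitting log.
fit_s = elapsed_seconds("2019-10-14 20:51:56", "2019-10-14 21:56:26")
# First and last timestamps of the prediction log.
predict_s = elapsed_seconds("2019-10-14 21:56:56", "2019-10-14 21:58:46")

print(fit_s / 60, "minutes to fit")
print(predict_s, "seconds to predict")
```

On this run, fitting took about 64.5 minutes, most of it inside TPOT's time budget, while prediction took under two minutes.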
MatPipe places the predictions in a column called "{target} predicted":
prediction_df.head()
 | MagpieData maximum Number | MagpieData minimum MendeleevNumber | MagpieData range MendeleevNumber | MagpieData avg_dev MendeleevNumber | MagpieData avg_dev AtomicWeight | MagpieData maximum MeltingT | MagpieData range MeltingT | MagpieData mean MeltingT | MagpieData avg_dev MeltingT | MagpieData mean Column | ... | MagpieData mean GSvolume_pa | MagpieData avg_dev GSvolume_pa | MagpieData mode GSvolume_pa | MagpieData maximum GSbandgap | MagpieData mean GSbandgap | MagpieData avg_dev GSbandgap | MagpieData avg_dev GSmagmom | MagpieData mean SpaceGroupNumber | MagpieData avg_dev SpaceGroupNumber | gap expt predicted |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4514 | 51.0 | 69.0 | 16.0 | 8.000000 | 28.190000 | 903.78 | 211.10 | 798.230000 | 105.550000 | 13.500000 | ... | 22.760000 | 8.800000 | 13.960000 | 0.000 | 0.000000 | 0.00000 | 0.000000 | 180.000000 | 14.000000 | 0.92970 |
834 | 52.0 | 58.0 | 32.0 | 14.506173 | 31.127892 | 1768.00 | 1045.34 | 1085.625278 | 473.871335 | 13.569444 | ... | 26.250023 | 11.114599 | 34.763333 | 0.464 | 0.302889 | 0.21034 | 0.701950 | 166.583333 | 19.039352 | 0.84045 |
4481 | 30.0 | 61.0 | 26.0 | 12.187500 | 21.802408 | 1728.00 | 1673.20 | 735.406667 | 744.445000 | 13.416667 | ... | 9.965208 | 0.931892 | 9.105000 | 0.000 | 0.000000 | 0.00000 | 0.279091 | 107.041667 | 102.961806 | 0.84395 |
3958 | 79.0 | 43.0 | 30.0 | 9.500000 | 79.771150 | 1941.00 | 1007.53 | 1387.282500 | 276.858750 | 9.750000 | ... | 16.642500 | 0.081250 | 16.700000 | 0.000 | 0.000000 | 0.00000 | 0.000008 | 217.250000 | 11.625000 | 1.65345 |
3087 | 59.0 | 17.0 | 61.0 | 18.080000 | 31.806681 | 1687.00 | 483.00 | 1523.200000 | 131.040000 | 9.000000 | ... | 19.506034 | 7.214759 | 10.487586 | 0.773 | 0.309200 | 0.37104 | 0.000149 | 216.400000 | 8.960000 | 0.00000 |
5 rows × 49 columns
Now let's score our predictions using mean absolute error (MAE), and compare them to a DummyRegressor from sklearn.
from sklearn.metrics import mean_absolute_error
from sklearn.dummy import DummyRegressor
# fit the dummy
dr = DummyRegressor()
dr.fit(train_df["composition"], train_df[target])
dummy_test = dr.predict(test_df["composition"])
# Score dummy and MatPipe
true = test_df[target]
matpipe_test = prediction_df[target + " predicted"]
mae_matpipe = mean_absolute_error(true, matpipe_test)
mae_dummy = mean_absolute_error(true, dummy_test)
print("Dummy MAE: {} eV".format(mae_dummy))
print("MatPipe MAE: {} eV".format(mae_matpipe))
Dummy MAE: 1.1256546688824407 eV
MatPipe MAE: 0.4591923995656894 eV
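For readers who want to see what is being compared, here is a minimal sketch of the metric and the baseline. It relies on sklearn's DummyRegressor defaulting to predicting the mean of the training targets; the toy values and the mae helper are hypothetical, and only the last calculation uses the two MAE values printed above.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error: the average of |true - predicted|."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


# Hypothetical toy values, not taken from the dataset.
y_train = [0.0, 0.0, 3.0]
y_test = [0.0, 2.0]
y_model = [0.5, 1.5]

# A mean-strategy dummy predicts the training mean for every test sample.
baseline = mean(y_train)
mae_dummy_toy = mae(y_test, [baseline] * len(y_test))
mae_model_toy = mae(y_test, y_model)
print(mae_dummy_toy, mae_model_toy)

# Relative error reduction for the MAE values reported in this notebook.
reduction = 1 - 0.4591923995656894 / 1.1256546688824407
print("{:.1%}".format(reduction))
```

On the reported numbers, the fitted pipeline cuts the dummy baseline's MAE by roughly 59%.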
Inspect MatPipe internals with a dict/text digest from either MatPipe.inspect (long, comprehensive version of all proper attribute names) or MatPipe.summarize (executive summary).
import pprint
# Get a summary and save a copy to json
summary = pipe.summarize(filename="MatPipe_predict_experimental_gap_from_composition_summary.json")
pprint.pprint(summary)
{'data_cleaning': {'drop_na_targets': 'True', 'encoder': 'one-hot', 'feature_na_method': 'drop', 'na_method_fit': 'drop', 'na_method_transform': 'fill'}, 'feature_reduction': {'reducer_params': "{'tree': {'importance_percentile': " "0.99, 'mode': 'regression', " "'random_state': 0}}", 'reducers': "('corr', 'tree')"}, 'features': ['MagpieData maximum Number', 'MagpieData minimum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData maximum NdValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData minimum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber'], 'featurizers': {'bandstructure': [BandFeaturizer(find_method='nearest', kpoints=None, nbands=2), 
BranchPointEnergy(atol=1e-05, calculate_band_edges=True, n_cb=1, n_vb=1)], 'composition': [ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x1280f9710>, features=['Number', 'MendeleevNumber', 'AtomicWeight', 'MeltingT', 'Column', 'Row', 'CovalentRadius', 'Electronegativity', 'NsValence', 'NpValence', 'NdValence', 'NfValence', 'NValence', 'NsUnfilled', 'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled', 'GSvolume_pa', 'GSbandgap', 'GSmagmom', 'SpaceGroupNumber'], stats=['minimum', 'maximum', 'range', 'mean', 'avg_dev', 'mode']), OxidationStates(stats=['minimum', 'maximum', 'range', 'std_dev']), ElectronAffinity(), IonProperty(data_source=<matminer.utils.data.PymatgenData object at 0x122c17710>, fast=False)], 'dos': [DOSFeaturizer(contributors=1, decay_length=0.1, gaussian_smear=0.05, sampling_resolution=100), DopingFermi(T=300, dopings=[-1e+20, 1e+20], eref='midgap', return_eref=False), Hybridization(decay_length=0.1, gaussian_smear=0.05, sampling_resolution=100, species=[]), DosAsymmetry(decay_length=0.5, gaussian_smear=0.05, sampling_resolution=100)], 'structure': [DensityFeatures(desired_features=None), GlobalSymmetryFeatures(desired_features=None), EwaldEnergy(accuracy=4), SineCoulombMatrix(diag_elems=True, flatten=True), GlobalInstabilityIndex(disordered_pymatgen=False, r_cut=4.0), StructuralComplexity(symprec=0.1)]}, 'ml_model': 'Pipeline(memory=Memory(location=/var/folders/4z/3vrw2wq10kzfh29c4x35qk3m0000gp/T/tmp79ge0rli/joblib),\n' " steps=[('variancethreshold', " 'VarianceThreshold(threshold=0.2)),\n' " ('normalizer', Normalizer(copy=True, " "norm='max')),\n" " ('randomforestregressor',\n" ' RandomForestRegressor(bootstrap=False, ' "criterion='mse',\n" ' max_depth=None,\n' ' ' 'max_features=0.7500000000000002,\n' ' max_leaf_nodes=None,\n' ' ' 'min_impurity_decrease=0.0,\n' ' min_impurity_split=None,\n' ' min_samples_leaf=1, ' 'min_samples_split=2,\n' ' ' 'min_weight_fraction_leaf=0.0,\n' ' n_estimators=200, ' 'n_jobs=None,\n' ' 
oob_score=False, ' 'random_state=None,\n' ' verbose=0, ' 'warm_start=False))],\n' ' verbose=False)'}
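Because the summary is also saved as JSON, the digest can be reloaded later without refitting. The sketch below uses a small stand-in dict and a temporary file rather than the real summary file; only the top-level key names are taken from the printed output above, the values are abbreviated placeholders, and the exact layout of the file written by summarize is an assumption.

```python
import json
import os
import tempfile

# Stand-in for the summary: key names from the output above, values shortened.
summary_stub = {
    "data_cleaning": {"feature_na_method": "drop", "na_method_fit": "drop"},
    "feature_reduction": {"reducers": "('corr', 'tree')"},
    "features": ["MagpieData maximum Number", "MagpieData mean MeltingT"],
}

# Write the stand-in to a temporary location (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "summary_stub.json")
with open(path, "w") as f:
    json.dump(summary_stub, f, indent=2)

# Reload it and look at the pieces of interest.
with open(path) as f:
    reloaded = json.load(f)

print(sorted(reloaded))
print(len(reloaded["features"]), "features listed")
```

The same json.load call, pointed at the file name passed to summarize, should give back a dict that can be searched or diffed between runs.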
# Explain the MatPipe's internals more comprehensively
details = pipe.inspect(filename="MatPipe_predict_experimental_gap_from_composition_details.json")
print(details)
{'autofeaturizer': {'autofeaturizer': {'cache_src': None, 'preset': 'express', '_logger': <Logger automatminer (INFO)>, 'featurizers': {'composition': [ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x1280f9710>, features=['Number', 'MendeleevNumber', 'AtomicWeight', 'MeltingT', 'Column', 'Row', 'CovalentRadius', 'Electronegativity', 'NsValence', 'NpValence', 'NdValence', 'NfValence', 'NValence', 'NsUnfilled', 'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled', 'GSvolume_pa', 'GSbandgap', 'GSmagmom', 'SpaceGroupNumber'], stats=['minimum', 'maximum', 'range', 'mean', 'avg_dev', 'mode']), OxidationStates(stats=['minimum', 'maximum', 'range', 'std_dev']), ElectronAffinity(), IonProperty(data_source=<matminer.utils.data.PymatgenData object at 0x122c17710>, fast=False)], 'structure': [DensityFeatures(desired_features=None), GlobalSymmetryFeatures(desired_features=None), EwaldEnergy(accuracy=4), SineCoulombMatrix(diag_elems=True, flatten=True), GlobalInstabilityIndex(disordered_pymatgen=False, r_cut=4.0), StructuralComplexity(symprec=0.1)], 'bandstructure': [BandFeaturizer(find_method='nearest', kpoints=None, nbands=2), BranchPointEnergy(atol=1e-05, calculate_band_edges=True, n_cb=1, n_vb=1)], 'dos': [DOSFeaturizer(contributors=1, decay_length=0.1, gaussian_smear=0.05, sampling_resolution=100), DopingFermi(T=300, dopings=[-1e+20, 1e+20], eref='midgap', return_eref=False), Hybridization(decay_length=0.1, gaussian_smear=0.05, sampling_resolution=100, species=[]), DosAsymmetry(decay_length=0.5, gaussian_smear=0.05, sampling_resolution=100)]}, 'exclude': [], 'functionalize': False, 'ignore_cols': [], 'fitted_input_df': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 2, 'samples': 3683}, 'converted_input_df': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 2, 'samples': 3683}, 'ignore_errors': True, 'drop_inputs': True, 'multiindex': False, 'do_precheck': True, 'n_jobs': None, 'guess_oxistates': True, 'features': ['MagpieData 
minimum Number', 'MagpieData maximum Number', 'MagpieData range Number', 'MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mode Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData maximum AtomicWeight', 'MagpieData range AtomicWeight', 'MagpieData mean AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData mode AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData maximum Column', 'MagpieData range Column', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mode Column', 'MagpieData minimum Row', 'MagpieData maximum Row', 'MagpieData range Row', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mode Row', 'MagpieData minimum CovalentRadius', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData minimum NsValence', 'MagpieData maximum NsValence', 'MagpieData range NsValence', 'MagpieData mean NsValence', 'MagpieData avg_dev NsValence', 'MagpieData mode NsValence', 'MagpieData minimum NpValence', 'MagpieData maximum NpValence', 'MagpieData range NpValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mode NpValence', 'MagpieData minimum NdValence', 'MagpieData maximum NdValence', 'MagpieData range NdValence', 'MagpieData mean 
NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mode NdValence', 'MagpieData minimum NfValence', 'MagpieData maximum NfValence', 'MagpieData range NfValence', 'MagpieData mean NfValence', 'MagpieData avg_dev NfValence', 'MagpieData mode NfValence', 'MagpieData minimum NValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mode NValence', 'MagpieData minimum NsUnfilled', 'MagpieData maximum NsUnfilled', 'MagpieData range NsUnfilled', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mode NsUnfilled', 'MagpieData minimum NpUnfilled', 'MagpieData maximum NpUnfilled', 'MagpieData range NpUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mode NpUnfilled', 'MagpieData minimum NdUnfilled', 'MagpieData maximum NdUnfilled', 'MagpieData range NdUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mode NdUnfilled', 'MagpieData minimum NfUnfilled', 'MagpieData maximum NfUnfilled', 'MagpieData range NfUnfilled', 'MagpieData mean NfUnfilled', 'MagpieData avg_dev NfUnfilled', 'MagpieData mode NfUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData minimum GSbandgap', 'MagpieData maximum GSbandgap', 'MagpieData range GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSbandgap', 'MagpieData minimum GSmagmom', 'MagpieData maximum GSmagmom', 'MagpieData range GSmagmom', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData maximum 
SpaceGroupNumber', 'MagpieData range SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber', 'minimum oxidation state', 'maximum oxidation state', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'compound possible', 'max ionic char', 'avg ionic char'], 'auto_featurizer': True, 'removed_featurizers': [YangSolidSolution(), Miedema(data_source='Miedema', ss_types=['min'], struct_types=['inter', 'amor', 'ss'])], 'composition_col': 'composition', 'structure_col': 'structure', 'bandstruct_col': 'bandstructure', 'dos_col': 'dos', 'is_fit': True, 'fittable_fcls': {'PartialRadialDistributionFunction', 'BagofBonds', 'BondFractions'}, 'needs_fit': False, 'min_precheck_frac': 0.9}}, 'cleaner': {'cleaner': {'_logger': <Logger automatminer (INFO)>, 'max_na_frac': 0.01, 'feature_na_method': 'drop', 'encoder': 'one-hot', 'encode_categories': True, 'drop_na_targets': True, 'na_method_fit': 'drop', 'na_method_transform': 'fill', 'dropped_features': ['avg ionic char', 'std_dev oxidation state', 'maximum oxidation state', 'avg anion electron affinity', 'range oxidation state', 'minimum oxidation state', 'max ionic char', 'compound possible'], 'object_cols': [], 'number_cols': ['MagpieData minimum Number', 'MagpieData maximum Number', 'MagpieData range Number', 'MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mode Number', 'MagpieData minimum MendeleevNumber', 'MagpieData maximum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData minimum AtomicWeight', 'MagpieData maximum AtomicWeight', 'MagpieData range AtomicWeight', 'MagpieData mean AtomicWeight', 'MagpieData avg_dev AtomicWeight', 'MagpieData mode AtomicWeight', 'MagpieData minimum MeltingT', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 
'MagpieData avg_dev MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData maximum Column', 'MagpieData range Column', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mode Column', 'MagpieData minimum Row', 'MagpieData maximum Row', 'MagpieData range Row', 'MagpieData mean Row', 'MagpieData avg_dev Row', 'MagpieData mode Row', 'MagpieData minimum CovalentRadius', 'MagpieData maximum CovalentRadius', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData maximum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData minimum NsValence', 'MagpieData maximum NsValence', 'MagpieData range NsValence', 'MagpieData mean NsValence', 'MagpieData avg_dev NsValence', 'MagpieData mode NsValence', 'MagpieData minimum NpValence', 'MagpieData maximum NpValence', 'MagpieData range NpValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData mode NpValence', 'MagpieData minimum NdValence', 'MagpieData maximum NdValence', 'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData mode NdValence', 'MagpieData minimum NfValence', 'MagpieData maximum NfValence', 'MagpieData range NfValence', 'MagpieData mean NfValence', 'MagpieData avg_dev NfValence', 'MagpieData mode NfValence', 'MagpieData minimum NValence', 'MagpieData maximum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mode NValence', 'MagpieData minimum NsUnfilled', 'MagpieData maximum NsUnfilled', 'MagpieData range NsUnfilled', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsUnfilled', 'MagpieData mode NsUnfilled', 'MagpieData minimum NpUnfilled', 'MagpieData maximum NpUnfilled', 
'MagpieData range NpUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData mode NpUnfilled', 'MagpieData minimum NdUnfilled', 'MagpieData maximum NdUnfilled', 'MagpieData range NdUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mode NdUnfilled', 'MagpieData minimum NfUnfilled', 'MagpieData maximum NfUnfilled', 'MagpieData range NfUnfilled', 'MagpieData mean NfUnfilled', 'MagpieData avg_dev NfUnfilled', 'MagpieData mode NfUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData minimum GSbandgap', 'MagpieData maximum GSbandgap', 'MagpieData range GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData mode GSbandgap', 'MagpieData minimum GSmagmom', 'MagpieData maximum GSmagmom', 'MagpieData range GSmagmom', 'MagpieData mean GSmagmom', 'MagpieData avg_dev GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData maximum SpaceGroupNumber', 'MagpieData range SpaceGroupNumber', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber', 'minimum oxidation state', 'maximum oxidation state', 'range oxidation state', 'std_dev oxidation state', 'avg anion electron affinity', 'compound possible', 'max ionic char', 'avg ionic char'], 'fitted_df': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 133, 'samples': 3683}, 'fitted_target': 'gap expt', 'dropped_samples': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 141, 'samples': 0}, 'max_problem_col_warning_threshold': 0.3, 'warnings': [], 'is_fit': True}}, 'reducer': {'reducer': {'reducers': 
('corr', 'tree'), 'corr_threshold': 0.95, 'n_pca_features': 'auto', 'tree_importance_percentile': 0.99, 'n_rebate_features': 0.3, '_logger': <Logger automatminer (INFO)>, '_keep_features': [], '_remove_features': [], 'removed_features': {'corr': ['MagpieData minimum Number', 'MagpieData mean Number', 'MagpieData avg_dev Number', 'MagpieData mode Number', 'MagpieData minimum AtomicWeight', 'MagpieData maximum AtomicWeight', 'MagpieData range AtomicWeight', 'MagpieData mean AtomicWeight', 'MagpieData mode AtomicWeight', 'MagpieData range NsValence', 'MagpieData range NfValence', 'MagpieData maximum NsUnfilled', 'MagpieData range NdUnfilled', 'MagpieData range NfUnfilled', 'MagpieData maximum GSvolume_pa', 'MagpieData range GSbandgap', 'MagpieData range GSmagmom', 'MagpieData range SpaceGroupNumber'], 'tree': ['MagpieData range Number', 'MagpieData maximum MendeleevNumber', 'MagpieData mean MendeleevNumber', 'MagpieData mode MendeleevNumber', 'MagpieData minimum MeltingT', 'MagpieData mode MeltingT', 'MagpieData minimum Column', 'MagpieData maximum Column', 'MagpieData range Column', 'MagpieData mode Column', 'MagpieData minimum Row', 'MagpieData maximum Row', 'MagpieData range Row', 'MagpieData avg_dev Row', 'MagpieData mode Row', 'MagpieData minimum CovalentRadius', 'MagpieData maximum CovalentRadius', 'MagpieData mode CovalentRadius', 'MagpieData maximum Electronegativity', 'MagpieData minimum NsValence', 'MagpieData maximum NsValence', 'MagpieData mean NsValence', 'MagpieData mode NsValence', 'MagpieData minimum NpValence', 'MagpieData maximum NpValence', 'MagpieData range NpValence', 'MagpieData mode NpValence', 'MagpieData minimum NdValence', 'MagpieData mode NdValence', 'MagpieData minimum NfValence', 'MagpieData maximum NfValence', 'MagpieData mean NfValence', 'MagpieData avg_dev NfValence', 'MagpieData mode NfValence', 'MagpieData maximum NValence', 'MagpieData mode NValence', 'MagpieData minimum NsUnfilled', 'MagpieData range NsUnfilled', 'MagpieData avg_dev 
NsUnfilled', 'MagpieData mode NsUnfilled', 'MagpieData minimum NpUnfilled', 'MagpieData maximum NpUnfilled', 'MagpieData range NpUnfilled', 'MagpieData mode NpUnfilled', 'MagpieData minimum NdUnfilled', 'MagpieData maximum NdUnfilled', 'MagpieData mean NdUnfilled', 'MagpieData mode NdUnfilled', 'MagpieData minimum NfUnfilled', 'MagpieData maximum NfUnfilled', 'MagpieData mean NfUnfilled', 'MagpieData avg_dev NfUnfilled', 'MagpieData mode NfUnfilled', 'MagpieData minimum NUnfilled', 'MagpieData maximum NUnfilled', 'MagpieData range NUnfilled', 'MagpieData mode NUnfilled', 'MagpieData minimum GSbandgap', 'MagpieData mode GSbandgap', 'MagpieData minimum GSmagmom', 'MagpieData maximum GSmagmom', 'MagpieData mean GSmagmom', 'MagpieData mode GSmagmom', 'MagpieData minimum SpaceGroupNumber', 'MagpieData maximum SpaceGroupNumber', 'MagpieData mode SpaceGroupNumber']}, 'retained_features': ['MagpieData range NValence', 'MagpieData mean GSbandgap', 'MagpieData avg_dev MeltingT', 'MagpieData range CovalentRadius', 'MagpieData maximum Number', 'MagpieData range GSvolume_pa', 'MagpieData mean CovalentRadius', 'MagpieData mean NValence', 'MagpieData avg_dev NdValence', 'MagpieData avg_dev CovalentRadius', 'MagpieData mean NsUnfilled', 'MagpieData avg_dev NsValence', 'MagpieData avg_dev GSmagmom', 'MagpieData avg_dev NpValence', 'MagpieData mean NdValence', 'MagpieData mean NUnfilled', 'MagpieData avg_dev GSbandgap', 'MagpieData mode Electronegativity', 'MagpieData range Electronegativity', 'MagpieData range MeltingT', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData avg_dev AtomicWeight', 'MagpieData mean Column', 'MagpieData avg_dev SpaceGroupNumber', 'MagpieData avg_dev Column', 'MagpieData avg_dev NValence', 'MagpieData maximum GSbandgap', 'MagpieData range MendeleevNumber', 'MagpieData avg_dev NUnfilled', 'MagpieData maximum NdValence', 'MagpieData avg_dev MendeleevNumber', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev Electronegativity', 
'MagpieData range NdValence', 'MagpieData mean Row', 'MagpieData avg_dev NpUnfilled', 'MagpieData mode GSvolume_pa', 'MagpieData minimum GSvolume_pa', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean MeltingT', 'MagpieData minimum MendeleevNumber', 'MagpieData minimum Electronegativity', 'MagpieData maximum MeltingT', 'MagpieData mean Electronegativity', 'MagpieData minimum NValence', 'MagpieData mean NpValence', 'MagpieData mean SpaceGroupNumber'], 'reducer_params': {'tree': {'importance_percentile': 0.99, 'mode': 'regression', 'random_state': 0}}, '_pca': None, '_pca_feats': None, 'is_fit': True}}, 'learner': {'learner': {'mode': 'regression', 'tpot_kwargs': {'max_time_mins': 60, 'population_size': 20, 'cv': 5, 'n_jobs': -1, 'verbosity': 3, 'memory': 'auto', 'template': 'Selector-Transformer-Regressor', 'config_dict': {'sklearn.linear_model.ElasticNetCV': {'l1_ratio': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ]), 'tol': [1e-05, 0.0001, 0.001, 0.01, 0.1]}, 'sklearn.ensemble.ExtraTreesRegressor': {'n_estimators': [20, 100, 200, 500, 1000], 'max_features': array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]), 'min_samples_split': range(2, 21, 3), 'min_samples_leaf': range(1, 21, 3), 'bootstrap': [True, False]}, 'sklearn.ensemble.GradientBoostingRegressor': {'n_estimators': [20, 100, 200, 500, 1000], 'loss': ['ls', 'lad', 'huber', 'quantile'], 'learning_rate': [0.01, 0.1, 0.5, 1.0], 'max_depth': range(1, 11, 2), 'min_samples_split': range(2, 21, 3), 'min_samples_leaf': range(1, 21, 3), 'subsample': array([0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ]), 'max_features': array([0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. 
]), 'alpha': [0.75, 0.8, 0.85, 0.9, 0.95, 0.99]}, 'sklearn.tree.DecisionTreeRegressor': {'max_depth': range(1, 11, 2), 'min_samples_split': range(2, 21, 3), 'min_samples_leaf': range(1, 21, 3)}, 'sklearn.neighbors.KNeighborsRegressor': {'n_neighbors': range(1, 101), 'weights': ['uniform', 'distance'], 'p': [1, 2]}, 'sklearn.linear_model.LassoLarsCV': {'normalize': [True, False]}, 'sklearn.svm.LinearSVR': {'loss': ['epsilon_insensitive', 'squared_epsilon_insensitive'], 'dual': [True, False], 'tol': [1e-05, 0.0001, 0.001, 0.01, 0.1], 'C': [0.0001, 0.001, 0.01, 0.1, 0.5, 1.0, 5.0, 10.0, 15.0, 20.0, 25.0], 'epsilon': [0.0001, 0.001, 0.01, 0.1, 1.0]}, 'sklearn.ensemble.RandomForestRegressor': {'n_estimators': [20, 100, 200, 500, 1000], 'max_features': array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]), 'min_samples_split': range(2, 21, 3), 'min_samples_leaf': range(1, 21, 3), 'bootstrap': [True, False]}, 'sklearn.linear_model.RidgeCV': {}, 'xgboost.XGBRegressor': {'n_estimators': [20, 100, 200, 500, 1000], 'max_depth': range(1, 11, 2), 'learning_rate': [0.01, 0.1, 0.5, 1.0], 'subsample': array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]), 'min_child_weight': range(1, 21, 4), 'nthread': [1]}, 'sklearn.preprocessing.Binarizer': {'threshold': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ])}, 'sklearn.decomposition.FastICA': {'tol': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. 
])}, 'sklearn.cluster.FeatureAgglomeration': {'linkage': ['ward', 'complete', 'average'], 'affinity': ['euclidean', 'l1', 'l2', 'manhattan', 'cosine']}, 'sklearn.preprocessing.MaxAbsScaler': {}, 'sklearn.preprocessing.MinMaxScaler': {}, 'sklearn.preprocessing.Normalizer': {'norm': ['l1', 'l2', 'max']}, 'sklearn.kernel_approximation.Nystroem': {'kernel': ['rbf', 'cosine', 'chi2', 'laplacian', 'polynomial', 'poly', 'linear', 'additive_chi2', 'sigmoid'], 'gamma': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ]), 'n_components': range(1, 11)}, 'sklearn.decomposition.PCA': {'svd_solver': ['randomized'], 'iterated_power': range(1, 11)}, 'sklearn.preprocessing.PolynomialFeatures': {'degree': [2], 'include_bias': [False], 'interaction_only': [False]}, 'sklearn.kernel_approximation.RBFSampler': {'gamma': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ])}, 'sklearn.preprocessing.RobustScaler': {}, 'sklearn.preprocessing.StandardScaler': {}, 'tpot.builtins.ZeroCount': {}, 'tpot.builtins.OneHotEncoder': {'minimum_fraction': [0.05, 0.1, 0.15, 0.2, 0.25], 'sparse': [False], 'threshold': [10]}, 'sklearn.feature_selection.SelectFwe': {'alpha': array([0. 
, 0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007, 0.008, 0.009, 0.01 , 0.011, 0.012, 0.013, 0.014, 0.015, 0.016, 0.017, 0.018, 0.019, 0.02 , 0.021, 0.022, 0.023, 0.024, 0.025, 0.026, 0.027, 0.028, 0.029, 0.03 , 0.031, 0.032, 0.033, 0.034, 0.035, 0.036, 0.037, 0.038, 0.039, 0.04 , 0.041, 0.042, 0.043, 0.044, 0.045, 0.046, 0.047, 0.048, 0.049]), 'score_func': {'sklearn.feature_selection.f_regression': None}}, 'sklearn.feature_selection.SelectPercentile': {'percentile': range(1, 100), 'score_func': {'sklearn.feature_selection.f_regression': None}}, 'sklearn.feature_selection.VarianceThreshold': {'threshold': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.2]}, 'sklearn.feature_selection.SelectFromModel': {'threshold': array([0. , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 , 0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95, 1. ]), 'estimator': {'sklearn.ensemble.ExtraTreesRegressor': {'n_estimators': [100], 'max_features': array([0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95])}}}}, 'scoring': 'neg_mean_absolute_error'}, 'models': OrderedDict([('XGBRegressor', [{'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('XGBRegressor(MinMaxScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=65)), XGBRegressor__learning_rate=0.01, XGBRegressor__max_depth=3, XGBRegressor__min_child_weight=13, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.35000000000000003)',), 'operator_count': 3, 'internal_cv_score': -0.569127814761169}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('XGBRegressor(StandardScaler(SelectFromModel(input_matrix, SelectFromModel__ExtraTreesRegressor__max_features=0.7500000000000002, SelectFromModel__ExtraTreesRegressor__n_estimators=100, SelectFromModel__threshold=0.05)), XGBRegressor__learning_rate=0.01, XGBRegressor__max_depth=7, XGBRegressor__min_child_weight=17, XGBRegressor__n_estimators=200, 
XGBRegressor__nthread=1, XGBRegressor__subsample=0.6500000000000001)',), 'operator_count': 3, 'internal_cv_score': -0.575087151522238}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('XGBRegressor(StandardScaler(SelectFromModel(input_matrix, SelectFromModel__ExtraTreesRegressor__max_features=0.7500000000000002, SelectFromModel__ExtraTreesRegressor__n_estimators=100, SelectFromModel__threshold=0.0)), XGBRegressor__learning_rate=0.01, XGBRegressor__max_depth=7, XGBRegressor__min_child_weight=17, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.6500000000000001)',), 'operator_count': 3, 'internal_cv_score': -0.6215060869696674}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('XGBRegressor(MinMaxScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=65)), XGBRegressor__learning_rate=0.01, XGBRegressor__max_depth=3, XGBRegressor__min_child_weight=13, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.35000000000000003)',), 'operator_count': 3, 'internal_cv_score': -0.6669258792711521}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6969706764396285}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('XGBRegressor(MinMaxScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=65)), XGBRegressor__learning_rate=0.1, XGBRegressor__max_depth=3, XGBRegressor__min_child_weight=13, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.35000000000000003)',), 'operator_count': 3, 'internal_cv_score': -0.7385281412308007}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.7480645948173358}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': 
('XGBRegressor(StandardScaler(SelectFromModel(input_matrix, SelectFromModel__ExtraTreesRegressor__max_features=0.7500000000000002, SelectFromModel__ExtraTreesRegressor__n_estimators=100, SelectFromModel__threshold=0.05)), XGBRegressor__learning_rate=0.1, XGBRegressor__max_depth=7, XGBRegressor__min_child_weight=17, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.6500000000000001)',), 'operator_count': 3, 'internal_cv_score': -0.8026821366120996}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('XGBRegressor(StandardScaler(SelectFromModel(input_matrix, SelectFromModel__ExtraTreesRegressor__max_features=0.7500000000000002, SelectFromModel__ExtraTreesRegressor__n_estimators=100, SelectFromModel__threshold=0.05)), XGBRegressor__learning_rate=0.01, XGBRegressor__max_depth=7, XGBRegressor__min_child_weight=17, XGBRegressor__n_estimators=200, XGBRegressor__nthread=1, XGBRegressor__subsample=0.6500000000000001)',), 'operator_count': 3, 'internal_cv_score': -0.8089374047850457}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -1.7837394544333396}]), ('ElasticNetCV', [{'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.8164626453233076}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('ElasticNetCV(RobustScaler(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05)), ElasticNetCV__l1_ratio=1.0, ElasticNetCV__tol=0.001)',), 'operator_count': 3, 'internal_cv_score': -0.8164660633330266}]), ('ExtraTreesRegressor', [{'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('ExtraTreesRegressor(ZeroCount(SelectPercentile(input_matrix, SelectPercentile__percentile=32)), ExtraTreesRegressor__bootstrap=True, ExtraTreesRegressor__max_features=0.6500000000000001, 
ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=17, ExtraTreesRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.6001026044965307}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('ExtraTreesRegressor(ZeroCount(SelectPercentile(input_matrix, SelectPercentile__percentile=74)), ExtraTreesRegressor__bootstrap=True, ExtraTreesRegressor__max_features=0.6500000000000001, ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=17, ExtraTreesRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.6023846638211461}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 1, 'predecessor': ('ExtraTreesRegressor(StandardScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=33)), ExtraTreesRegressor__bootstrap=False, ExtraTreesRegressor__max_features=0.6500000000000001, ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=11, ExtraTreesRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.6045251072197617}, {'generation': 'INVALID', 'mutation_count': 0, 'crossover_count': 1, 'predecessor': ('ExtraTreesRegressor(StandardScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=20)), ExtraTreesRegressor__bootstrap=False, ExtraTreesRegressor__max_features=0.6500000000000001, ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=11, ExtraTreesRegressor__n_estimators=500)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)'), 'operator_count': 3, 'internal_cv_score': -0.6158514341896033}, {'generation': 0, 'mutation_count': 0, 
'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6375123860905978}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6509010828842549}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('ExtraTreesRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=66), Nystroem__gamma=0.65, Nystroem__kernel=additive_chi2, Nystroem__n_components=8), ExtraTreesRegressor__bootstrap=True, ExtraTreesRegressor__max_features=0.45000000000000007, ExtraTreesRegressor__min_samples_leaf=19, ExtraTreesRegressor__min_samples_split=8, ExtraTreesRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.7833717352531547}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.8384927269928548}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('ExtraTreesRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=66), Nystroem__gamma=0.65, Nystroem__kernel=additive_chi2, Nystroem__n_components=8), ExtraTreesRegressor__bootstrap=True, ExtraTreesRegressor__max_features=0.45000000000000007, ExtraTreesRegressor__min_samples_leaf=19, ExtraTreesRegressor__min_samples_split=8, ExtraTreesRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.8576068260773978}]), ('RandomForestRegressor', [{'generation': 'INVALID', 'mutation_count': 9, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)', 
'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)'), 'operator_count': 3, 'internal_cv_score': -0.455318603917173}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.45842830981210525}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.45912305078608934}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4594158770131556}, {'generation': 'INVALID', 'mutation_count': 8, 'crossover_count': 1, 'predecessor': 
('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.45977194554893497}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4605703481726741}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)', 'RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)'), 'operator_count': 3, 'internal_cv_score': -0.46136176692378017}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), 
RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.46176736844433963}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.46179249348121043}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.4620840399681434}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.4636897739587635}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), 
Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4647406961609935}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.46527932953808043}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4653152674989677}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.46563285018582967}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), 
Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4661545082738482}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.46727537793493007}, {'generation': 'INVALID', 'mutation_count': 8, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.468052936596661}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.46806860712494835}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), 
Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)'), 'operator_count': 3, 'internal_cv_score': -0.4682111621364521}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4682698397218453}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4684578070615303}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, 
RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4693576708232553}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4694256948336381}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4704077886260397}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)'), 'operator_count': 3, 'internal_cv_score': -0.4704287379948086}, 
{'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.4706860033700077}, {'generation': 'INVALID', 'mutation_count': 9, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.4712993735104124}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.47190471902468883}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': 
-0.47253502577281575}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.47256092221083323}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.4736661653552886}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.47395881953867036}, {'generation': 'INVALID', 'mutation_count': 10, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 
'internal_cv_score': -0.4755418318978232}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.4767649465160435}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4770011241962126}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.4771079125862781}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 
'internal_cv_score': -0.47736442567696874}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4775844756577783}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)'), 'operator_count': 3, 'internal_cv_score': -0.4788696819140464}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4790047978364108}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': 
('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4790520868975282}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.4794093061234272}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4813523796531178}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.4814206338121645}, {'generation': 'INVALID', 'mutation_count': 9, 'crossover_count': 1, 'predecessor': 
('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.48288021967730527}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4830894376464305}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.4848959183862999}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.45000000000000007, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.48573855595856125}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': 
('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.486632824201072}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.48686893877942305}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.48808093210332115}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.4883693989166911}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': 
('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.48964958977346473}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.4896931615833874}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.49068955754032634}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4916322394327768}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': 
('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.4944770445819323}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.4949934013332545}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)'), 'operator_count': 3, 'internal_cv_score': -0.495442817414375}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), 
RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.4965168745575482}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.49662437839581725}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.49785791904644255}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.498404746131628}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), 
RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5006650431759189}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.501016326273348}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=7, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5015576419135714}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.45000000000000007, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5017029800970444}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), 
RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5017239416184885}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.5019658700076691}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.45000000000000007, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.502132009247242}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.502173717074214}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.5028369660168946}, {'generation': 'INVALID', 'mutation_count': 6, 
'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.503488141295253}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5043467494687167}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.506085745863076}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5066423203277093}, {'generation': 'INVALID', 'mutation_count': 4, 
'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5082377148963679}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5108597894961291}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5113743494461277}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5138761448307937}, {'generation': 'INVALID', 'mutation_count': 4, 
'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)'), 'operator_count': 3, 'internal_cv_score': -0.5140737071551884}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.5142631825324426}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5143240529651936}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), 
Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.45000000000000007, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.5164723844402882}, {'generation': 'INVALID', 'mutation_count': 6, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=83), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=200)',), 'operator_count': 3, 'internal_cv_score': -0.5185720052718958}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5197089440712135}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=45), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5256653512083849}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), 
Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5256704665329128}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=37), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5256704665329128}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=37), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, RandomForestRegressor__n_estimators=100)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)'), 'operator_count': 3, 'internal_cv_score': -0.5285409298914097}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, 
RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5290078620113645}, {'generation': 'INVALID', 'mutation_count': 7, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.529713660310341}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5314490979883193}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5323142612284427}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, 
RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5344570254494924}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=13, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5372115527408662}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.5382520161974218}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.539747793345718}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, 
RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5419200265242919}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.6500000000000001, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5551007043567818}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5575491468660686}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5578647227402184}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, 
RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=1000)',), 'operator_count': 3, 'internal_cv_score': -0.5580485680169603}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -0.558141826934892}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 2, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=13, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5596290082737609}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5625875480737101}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 2, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, 
RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=13, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)', 'RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=32), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)'), 'operator_count': 3, 'internal_cv_score': -0.5625875480737101}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.05), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5645996661332664}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=47), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5650904927981569}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, 
RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5659222817539376}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=37), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5660384787526581}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5664588621662168}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=89), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.56672074045975}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, 
RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5668011133999042}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5669404523220268}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=13, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5672280982980369}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.5684071259693165}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)', 'ExtraTreesRegressor(ZeroCount(SelectPercentile(input_matrix, SelectPercentile__percentile=32)), ExtraTreesRegressor__bootstrap=True, ExtraTreesRegressor__max_features=0.6500000000000001, ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=17, 
ExtraTreesRegressor__n_estimators=200)'), 'operator_count': 3, 'internal_cv_score': -0.5684071259693165}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=37), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=8, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5726874801214599}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5738818779775587}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5760918962300835}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=32), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, 
RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5760918962300835}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5771892863113667}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5792735807780065}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.45000000000000007, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5841747490929737}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, 
RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5867055144239277}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5867056463769564}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=13, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5876694192685129}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=33), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.5885524671574142}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=43), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, 
RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5908714583830116}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.5994665910197039}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=36), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)', 'ExtraTreesRegressor(StandardScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=20)), ExtraTreesRegressor__bootstrap=False, ExtraTreesRegressor__max_features=0.6500000000000001, ExtraTreesRegressor__min_samples_leaf=13, ExtraTreesRegressor__min_samples_split=11, ExtraTreesRegressor__n_estimators=500)'), 'operator_count': 3, 'internal_cv_score': -0.6038520140105461}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=20), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=20, RandomForestRegressor__n_estimators=100)',), 'operator_count': 3, 'internal_cv_score': -0.6085274872092868}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 
'predecessor': ('RandomForestRegressor(FastICA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.0001), FastICA__tol=0.65), RandomForestRegressor__bootstrap=True, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.6462788409484199}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(FastICA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.0001), FastICA__tol=0.65), RandomForestRegressor__bootstrap=True, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.6545899218581008}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6634546572350697}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(FastICA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.0001), FastICA__tol=0.65), RandomForestRegressor__bootstrap=True, RandomForestRegressor__max_features=0.35000000000000003, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.6634546572350697}, {'generation': 'INVALID', 'mutation_count': 5, 'crossover_count': 1, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=71), Normalizer__norm=max), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=11, 
RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.7734237644225525}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.7996890896172534}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=14), Nystroem__gamma=0.15000000000000002, Nystroem__kernel=additive_chi2, Nystroem__n_components=6), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=10, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.7996890896172534}, {'generation': 'INVALID', 'mutation_count': 2, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=14), Nystroem__gamma=0.75, Nystroem__kernel=additive_chi2, Nystroem__n_components=6), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=10, RandomForestRegressor__min_samples_split=11, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -0.7996890896172534}, {'generation': 'INVALID', 'mutation_count': 4, 'crossover_count': 0, 'predecessor': ('RandomForestRegressor(Normalizer(SelectPercentile(input_matrix, SelectPercentile__percentile=96), Normalizer__norm=l1), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.25000000000000006, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=500)',), 'operator_count': 3, 'internal_cv_score': -inf}, {'generation': 'INVALID', 'mutation_count': 3, 'crossover_count': 0, 'predecessor': 
('RandomForestRegressor(Normalizer(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), Normalizer__norm=l2), RandomForestRegressor__bootstrap=False, RandomForestRegressor__max_features=0.7500000000000002, RandomForestRegressor__min_samples_leaf=1, RandomForestRegressor__min_samples_split=2, RandomForestRegressor__n_estimators=20)',), 'operator_count': 3, 'internal_cv_score': -inf}]), ('GradientBoostingRegressor', [{'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('GradientBoostingRegressor(PolynomialFeatures(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.0001), PolynomialFeatures__degree=2, PolynomialFeatures__include_bias=False, PolynomialFeatures__interaction_only=False), GradientBoostingRegressor__alpha=0.75, GradientBoostingRegressor__learning_rate=0.01, GradientBoostingRegressor__loss=ls, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=1.0, GradientBoostingRegressor__min_samples_leaf=1, GradientBoostingRegressor__min_samples_split=14, GradientBoostingRegressor__n_estimators=100, GradientBoostingRegressor__subsample=0.2)',), 'operator_count': 3, 'internal_cv_score': -0.7340601028290766}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.7398949682708912}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.9083310211639299}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('GradientBoostingRegressor(PCA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.2), PCA__iterated_power=5, PCA__svd_solver=randomized), GradientBoostingRegressor__alpha=0.8, GradientBoostingRegressor__learning_rate=0.01, GradientBoostingRegressor__loss=lad, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=0.45, GradientBoostingRegressor__min_samples_leaf=16, 
GradientBoostingRegressor__min_samples_split=20, GradientBoostingRegressor__n_estimators=20, GradientBoostingRegressor__subsample=0.45)',), 'operator_count': 3, 'internal_cv_score': -0.9087905235167509}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('GradientBoostingRegressor(FastICA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.1), FastICA__tol=0.8), GradientBoostingRegressor__alpha=0.75, GradientBoostingRegressor__learning_rate=0.5, GradientBoostingRegressor__loss=quantile, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=0.1, GradientBoostingRegressor__min_samples_leaf=1, GradientBoostingRegressor__min_samples_split=14, GradientBoostingRegressor__n_estimators=20, GradientBoostingRegressor__subsample=0.6000000000000001)',), 'operator_count': 3, 'internal_cv_score': -0.9559240000630987}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.9773121723225457}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -1.644452799130757}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('GradientBoostingRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=91), Nystroem__gamma=0.55, Nystroem__kernel=chi2, Nystroem__n_components=3), GradientBoostingRegressor__alpha=0.8, GradientBoostingRegressor__learning_rate=0.01, GradientBoostingRegressor__loss=quantile, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=0.45, GradientBoostingRegressor__min_samples_leaf=4, GradientBoostingRegressor__min_samples_split=8, GradientBoostingRegressor__n_estimators=100, GradientBoostingRegressor__subsample=0.7500000000000001)',), 'operator_count': 3, 'internal_cv_score': -1.645012338359622}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 
'predecessor': ('GradientBoostingRegressor(Nystroem(SelectPercentile(input_matrix, SelectPercentile__percentile=91), Nystroem__gamma=0.55, Nystroem__kernel=chi2, Nystroem__n_components=3), GradientBoostingRegressor__alpha=0.8, GradientBoostingRegressor__learning_rate=0.01, GradientBoostingRegressor__loss=quantile, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=0.45, GradientBoostingRegressor__min_samples_leaf=4, GradientBoostingRegressor__min_samples_split=8, GradientBoostingRegressor__n_estimators=100, GradientBoostingRegressor__subsample=0.7500000000000001)',), 'operator_count': 3, 'internal_cv_score': -1.6458734739624976}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('GradientBoostingRegressor(PolynomialFeatures(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.0001), PolynomialFeatures__degree=2, PolynomialFeatures__include_bias=False, PolynomialFeatures__interaction_only=False), GradientBoostingRegressor__alpha=0.75, GradientBoostingRegressor__learning_rate=0.01, GradientBoostingRegressor__loss=ls, GradientBoostingRegressor__max_depth=9, GradientBoostingRegressor__max_features=1.0, GradientBoostingRegressor__min_samples_leaf=1, GradientBoostingRegressor__min_samples_split=14, GradientBoostingRegressor__n_estimators=100, GradientBoostingRegressor__subsample=0.2)',), 'operator_count': 3, 'internal_cv_score': -inf}]), ('DecisionTreeRegressor', [{'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6392929675471104}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('DecisionTreeRegressor(PolynomialFeatures(SelectPercentile(input_matrix, SelectPercentile__percentile=40), PolynomialFeatures__degree=2, PolynomialFeatures__include_bias=False, PolynomialFeatures__interaction_only=False), DecisionTreeRegressor__max_depth=5, DecisionTreeRegressor__min_samples_leaf=16, 
DecisionTreeRegressor__min_samples_split=17)',), 'operator_count': 3, 'internal_cv_score': -0.6596436463837325}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('DecisionTreeRegressor(RBFSampler(SelectPercentile(input_matrix, SelectPercentile__percentile=55), RBFSampler__gamma=0.8500000000000001), DecisionTreeRegressor__max_depth=3, DecisionTreeRegressor__min_samples_leaf=4, DecisionTreeRegressor__min_samples_split=17)',), 'operator_count': 3, 'internal_cv_score': -1.1518206595613913}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('DecisionTreeRegressor(RBFSampler(SelectPercentile(input_matrix, SelectPercentile__percentile=55), RBFSampler__gamma=0.8500000000000001), DecisionTreeRegressor__max_depth=3, DecisionTreeRegressor__min_samples_leaf=4, DecisionTreeRegressor__min_samples_split=17)',), 'operator_count': 3, 'internal_cv_score': -1.1554775152101997}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -1.1571825434056886}]), ('KNeighborsRegressor', [{'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.6580176053988469}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('KNeighborsRegressor(MaxAbsScaler(SelectPercentile(input_matrix, SelectPercentile__percentile=97)), KNeighborsRegressor__n_neighbors=82, KNeighborsRegressor__p=1, KNeighborsRegressor__weights=uniform)',), 'operator_count': 3, 'internal_cv_score': -0.7266140656857427}]), ('LassoLarsCV', [{'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('LassoLarsCV(PCA(VarianceThreshold(input_matrix, VarianceThreshold__threshold=0.005), PCA__iterated_power=10, PCA__svd_solver=randomized), LassoLarsCV__normalize=False)',), 'operator_count': 3, 'internal_cv_score': -0.8109762291081436}, {'generation': 0, 
'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.8125928360909143}, {'generation': 0, 'mutation_count': 0, 'crossover_count': 0, 'predecessor': ('ROOT',), 'operator_count': 3, 'internal_cv_score': -0.9044340233523197}, {'generation': 'INVALID', 'mutation_count': 1, 'crossover_count': 0, 'predecessor': ('LassoLarsCV(Binarizer(SelectPercentile(input_matrix, SelectPercentile__percentile=17), Binarizer__threshold=1.0), LassoLarsCV__normalize=False)',), 'operator_count': 3, 'internal_cv_score': -0.9048426620159453}])]), 'random_state': None, 'greater_score_is_better': True, '_fitted_target': 'gap expt', '_backend': ["('variancethreshold', VarianceThreshold(threshold=0.2))", "('normalizer', Normalizer(copy=True, norm='max'))", "('randomforestregressor', RandomForestRegressor(bootstrap=False, criterion='mse', max_depth=None,\n max_features=0.7500000000000002, max_leaf_nodes=None,\n min_impurity_decrease=0.0, min_impurity_split=None,\n min_samples_leaf=1, min_samples_split=2,\n min_weight_fraction_leaf=0.0, n_estimators=200,\n n_jobs=None, oob_score=False, random_state=None,\n verbose=0, warm_start=False))"], '_features': ['MagpieData maximum Number', 'MagpieData minimum MendeleevNumber', 'MagpieData range MendeleevNumber', 'MagpieData avg_dev MendeleevNumber', 'MagpieData avg_dev AtomicWeight', 'MagpieData maximum MeltingT', 'MagpieData range MeltingT', 'MagpieData mean MeltingT', 'MagpieData avg_dev MeltingT', 'MagpieData mean Column', 'MagpieData avg_dev Column', 'MagpieData mean Row', 'MagpieData range CovalentRadius', 'MagpieData mean CovalentRadius', 'MagpieData avg_dev CovalentRadius', 'MagpieData minimum Electronegativity', 'MagpieData range Electronegativity', 'MagpieData mean Electronegativity', 'MagpieData avg_dev Electronegativity', 'MagpieData mode Electronegativity', 'MagpieData avg_dev NsValence', 'MagpieData mean NpValence', 'MagpieData avg_dev NpValence', 'MagpieData maximum NdValence', 
'MagpieData range NdValence', 'MagpieData mean NdValence', 'MagpieData avg_dev NdValence', 'MagpieData minimum NValence', 'MagpieData range NValence', 'MagpieData mean NValence', 'MagpieData avg_dev NValence', 'MagpieData mean NsUnfilled', 'MagpieData mean NpUnfilled', 'MagpieData avg_dev NpUnfilled', 'MagpieData avg_dev NdUnfilled', 'MagpieData mean NUnfilled', 'MagpieData avg_dev NUnfilled', 'MagpieData minimum GSvolume_pa', 'MagpieData range GSvolume_pa', 'MagpieData mean GSvolume_pa', 'MagpieData avg_dev GSvolume_pa', 'MagpieData mode GSvolume_pa', 'MagpieData maximum GSbandgap', 'MagpieData mean GSbandgap', 'MagpieData avg_dev GSbandgap', 'MagpieData avg_dev GSmagmom', 'MagpieData mean SpaceGroupNumber', 'MagpieData avg_dev SpaceGroupNumber'], '_logger': <Logger automatminer (INFO)>, 'from_serialized': True, '_best_models': OrderedDict([('RandomForestRegressor', -0.455318603917173), ('XGBRegressor', -0.569127814761169), ('ExtraTreesRegressor', -0.6001026044965307), ('DecisionTreeRegressor', -0.6392929675471104), ('KNeighborsRegressor', -0.6580176053988469), ('GradientBoostingRegressor', -0.7340601028290766), ('LassoLarsCV', -0.8109762291081436), ('ElasticNetCV', -0.8164626453233076)]), 'is_fit': True}}, '_logger': <Logger automatminer (INFO)>, 'pre_fit_df': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 2, 'samples': 3683}, 'post_fit_df': {'obj': <class 'pandas.core.frame.DataFrame'>, 'columns': 49, 'samples': 3683}, 'ml_type': 'regression', 'target': 'gap expt', 'version': '2019.10.11', 'is_fit': True}
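The `_best_models` entry near the end of the digest maps each model family to its best internal cross-validation score. As a minimal sketch in plain Python, with the scores copied by hand from the output above rather than read from a live `pipe` object, the families can be ranked like this. The digest reports `'greater_score_is_better': True`, so the sketch assumes the score closest to zero is the best.

```python
from collections import OrderedDict

# Scores copied from the `_best_models` entry of the digest above.
best_models = OrderedDict([
    ("RandomForestRegressor", -0.455318603917173),
    ("XGBRegressor", -0.569127814761169),
    ("ExtraTreesRegressor", -0.6001026044965307),
    ("DecisionTreeRegressor", -0.6392929675471104),
    ("KNeighborsRegressor", -0.6580176053988469),
    ("GradientBoostingRegressor", -0.7340601028290766),
    ("LassoLarsCV", -0.8109762291081436),
    ("ElasticNetCV", -0.8164626453233076),
])

# Greater score is better, so sort in descending order.
ranked = sorted(best_models.items(), key=lambda item: item[1], reverse=True)
best_name, best_score = ranked[0]

for name, score in ranked:
    # Show how far each model family trails the best one.
    print(f"{name:<26s} {score:8.4f}  ({best_score - score:.4f} behind best)")
```

The top-ranked family here agrees with the final estimator listed in the digest's `_backend` entry, a random forest.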
You can also access MatPipe's internal objects directly rather than through a text digest, provided you know which attributes to look for (for example, the learner's best pipeline or the autofeaturizer's featurizers, shown below). See the online API docs or the source code for details.
# Access some attributes of MatPipe directly, instead of via a text digest
print(pipe.learner.best_pipeline)
Pipeline(memory=Memory(location=/var/folders/4z/3vrw2wq10kzfh29c4x35qk3m0000gp/T/tmp79ge0rli/joblib), steps=[('variancethreshold', VarianceThreshold(threshold=0.2)), ('normalizer', Normalizer(copy=True, norm='max')), ('randomforestregressor', RandomForestRegressor(bootstrap=False, criterion='mse', max_depth=None, max_features=0.7500000000000002, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=200, n_jobs=None, oob_score=False, random_state=None, verbose=0, warm_start=False))], verbose=False)
print(pipe.autofeaturizer.featurizers["composition"])
[ElementProperty(data_source=<matminer.utils.data.MagpieData object at 0x1280f9710>, features=['Number', 'MendeleevNumber', 'AtomicWeight', 'MeltingT', 'Column', 'Row', 'CovalentRadius', 'Electronegativity', 'NsValence', 'NpValence', 'NdValence', 'NfValence', 'NValence', 'NsUnfilled', 'NpUnfilled', 'NdUnfilled', 'NfUnfilled', 'NUnfilled', 'GSvolume_pa', 'GSbandgap', 'GSmagmom', 'SpaceGroupNumber'], stats=['minimum', 'maximum', 'range', 'mean', 'avg_dev', 'mode']), OxidationStates(stats=['minimum', 'maximum', 'range', 'std_dev']), ElectronAffinity(), IonProperty(data_source=<matminer.utils.data.PymatgenData object at 0x122c17710>, fast=False)]
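The feature names in the digest's `_features` list can be related back to the ElementProperty configuration printed above. The sketch below copies the `features` and `stats` lists from that output and enumerates every candidate label. The `MagpieData {stat} {feature}` pattern is an assumption inferred from the names in the digest, not taken from the matminer API.

```python
from itertools import product

# Copied from the ElementProperty entry printed above.
features = [
    "Number", "MendeleevNumber", "AtomicWeight", "MeltingT", "Column", "Row",
    "CovalentRadius", "Electronegativity", "NsValence", "NpValence",
    "NdValence", "NfValence", "NValence", "NsUnfilled", "NpUnfilled",
    "NdUnfilled", "NfUnfilled", "NUnfilled", "GSvolume_pa", "GSbandgap",
    "GSmagmom", "SpaceGroupNumber",
]
stats = ["minimum", "maximum", "range", "mean", "avg_dev", "mode"]

# Assumed label pattern, inferred from the names in the digest's `_features`.
candidate_labels = [
    f"MagpieData {stat} {feature}" for feature, stat in product(features, stats)
]

n_candidates = len(candidate_labels)  # 22 features x 6 statistics
n_kept = 48  # number of names in the digest's `_features` list

print(f"{n_kept} of {n_candidates} candidate Magpie features were kept")
```

Every name in `_features` begins with `MagpieData`, and the 48 retained features plus the target column account for the 49 columns reported for `post_fit_df`.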
# Save the pipeline for later
filename = "MatPipe_predict_experimental_gap_from_composition.p"
pipe.save(filename)
# Load your saved pipeline later, or on another machine
pipe_loaded = MatPipe.load(filename)
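After reloading, it can be worth confirming that the saved pipeline still behaves like the original. The helper below is a hypothetical sketch, not part of Automatminer. It assumes each pipeline's `predict` returns a table with a column named `"<target> predicted"`, and that `test_df` in the usage note is whatever held-out dataframe you predicted on earlier.

```python
import math


def predictions_match(pipe_a, pipe_b, df, target, tol=1e-8):
    """Return True if two fitted pipelines give the same predictions on df.

    Hypothetical helper: assumes predict() returns a table whose prediction
    column is named "<target> predicted".
    """
    col = f"{target} predicted"  # assumed output column name
    # Pass copies so neither pipeline can modify the caller's data.
    pred_a = list(pipe_a.predict(df.copy())[col])
    pred_b = list(pipe_b.predict(df.copy())[col])
    return len(pred_a) == len(pred_b) and all(
        math.isclose(a, b, abs_tol=tol) for a, b in zip(pred_a, pred_b)
    )
```

For example, `predictions_match(pipe, pipe_loaded, test_df, target="gap expt")` should return `True` if the round trip preserved the fitted model.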
Congrats! You've made it through the basic Automatminer tutorial!
In this tutorial, you learned how to:
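- Fit a MatPipe to training data and use it to predict on new data.
- Examine a pipeline's internals with MatPipe's introspection methods and direct attribute access.
- Save and load a MatPipe pipeline.

If you encountered any problems running this notebook, please open an issue on the repo or post on our support forum.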