authors: Alireza Faghaninia, Alex Dunn, Joseph Montoya, Daniel Dopp
This notebook was last updated 11/15/18 for version 0.4.5 of matminer.
Note that to get in-line plotting to work, you may need to start Jupyter Notebook with a higher data rate limit, e.g., jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10. We recommend doing this before starting.
A Citrine API key is required to load the data for this notebook (it can be found under account settings). Set the CITRINE_API environment variable or pass the API key as an argument to CitrineDataRetrieval(). (For reference, see the data retrieval notebook.)
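As a minimal sketch of the two options just described, the snippet below picks up a key either from an explicit argument or from the environment, and fails early with a readable message if neither is set. The helper name resolve_citrine_key is hypothetical (it is not part of matminer), and the environment variable name is taken as written in this notebook; check the matminer version you have installed if the lookup fails.

```python
import os


def resolve_citrine_key(explicit_key=None, env_var="CITRINE_API"):
    """Hypothetical helper: return explicit_key if given, else the value of env_var.

    The default env_var name follows the text of this notebook and is an
    assumption about your installed matminer version.
    """
    key = explicit_key or os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            "No Citrine API key found: set the %s environment variable "
            "or pass the key explicitly." % env_var
        )
    return key


# Usage (commented out so the sketch runs without matminer or a real key;
# the api_key keyword is assumed here):
# cdr = CitrineDataRetrieval(api_key=resolve_citrine_key())
```

Checking for the key up front keeps a missing credential from surfacing later as a less obvious error during data retrieval.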
This notebook illustrates a few more advanced examples of matminer's visualization features. Note that these examples and a few additional ones are included in script form in the matminer_examples repository.
import pprint
import pandas as pd
from pymatgen.core.composition import Composition
from figrecipes import PlotlyFig
from matminer.datasets import load_dataset
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval
This example generates a scatter plot of the properties of thermoelectric materials, based on the data available at http://www.mrl.ucsb.edu:8080/datamine/thermoelectric.jsp. The data is retrieved via the Citrine data retrieval tools. The dataset ID on Citrine is 150557.
# GET DATA
# Note that your Citrine API key must be set as the CITRINE_API
# environment variable or as an argument to the CitrineDataRetrieval() constructor
cdr = CitrineDataRetrieval()
df_te = cdr.get_dataframe(criteria={'data_type': 'experimental', 'data_set_id': 150557},
                          properties=['Seebeck coefficient'], secondary_fields=True)
# CLEAN AND PRUNE DATA
# Keep only the columns needed for plotting and convert them to numeric types.
# errors='ignore' leaves columns that cannot be parsed (e.g. chemicalFormula) unchanged.
numeric_cols = ['chemicalFormula', 'Electrical resistivity', 'Seebeck coefficient',
                'Thermal conductivity', 'Thermoelectric figure of merit (zT)']
df_te = df_te[numeric_cols].apply(pd.to_numeric, errors='ignore')
# Keep rows with resistivity strictly between 0.0005 and 0.1 and an absolute
# Seebeck coefficient below 500, then shorten the zT column name
df_te = df_te[(5e-4 < df_te['Electrical resistivity']) & (df_te['Electrical resistivity'] < 0.1)]
df_te = df_te[abs(df_te['Seebeck coefficient']) < 500]
df_te = df_te.rename(columns={'Thermoelectric figure of merit (zT)': 'zT'})
# GENERATE PLOTS
pf = PlotlyFig(df_te, x_scale='log', fontfamily='Times New Roman',
               hovercolor='white', x_title='Electrical Resistivity (cm/S)',
               y_title='Seebeck Coefficient (uV/K)',
               colorbar_title='Thermal Conductivity (W/m.K)',
               mode='notebook')
pf.xy((df_te['Electrical resistivity'], df_te['Seebeck coefficient']),
      labels='chemicalFormula',
      sizes='zT',
      colors='Thermal conductivity',
      color_range=[0, 5])
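The clean-and-prune step above can be tried without an API key on a small toy frame. The sketch below uses made-up values (not taken from the Citrine dataset) stored as strings, as retrieved data often is, and applies the same resistivity and Seebeck thresholds. One deliberate difference from the cell above: it converts only the measurement columns and uses errors='coerce', so unparseable entries become NaN rather than being left as strings.

```python
import pandas as pd

# Toy rows with made-up values; formulas are placeholders, not real materials
toy = pd.DataFrame({
    "chemicalFormula": ["A", "B", "C", "D"],
    "Electrical resistivity": ["0.001", "0.5", "0.01", "0.00001"],
    "Seebeck coefficient": ["120", "80", "-650", "40"],
})

# Convert the measurement columns from strings to numbers
for col in ["Electrical resistivity", "Seebeck coefficient"]:
    toy[col] = pd.to_numeric(toy[col], errors="coerce")

# Same thresholds as the notebook cell: strict bounds on resistivity,
# and a bound on the magnitude of the Seebeck coefficient
resistivity = toy["Electrical resistivity"]
in_range = (resistivity > 5e-4) & (resistivity < 0.1)
small_seebeck = toy["Seebeck coefficient"].abs() < 500

kept = toy[in_range & small_seebeck]
print(kept["chemicalFormula"].tolist())  # ['A']
```

Only row A passes both filters: B and D fall outside the resistivity window, and C has a Seebeck magnitude above 500. Combining the masks with & applies both conditions in one indexing step instead of two successive reassignments.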