Advanced Visualization

authors: Alireza Faghaninia, Alex Dunn, Joseph Montoya, Daniel Dopp

This notebook was last updated 11/15/18 for version 0.4.5 of matminer.

Note that in order to get the in-line plotting to work, you might need to start Jupyter notebook with a higher data rate, e.g., jupyter notebook --NotebookApp.iopub_data_rate_limit=1.0e10. We recommend you do this before starting.

A Citrine API key is required to load the data for this notebook (it can be found under account settings). Set the CITRINE_API environment variable or pass the API key as an argument to CitrineDataRetrieval(). (See the data retrieval notebook for reference.)
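As a minimal sketch of the key lookup described above, the helper below prefers an explicitly supplied key and falls back to the CITRINE_API environment variable. The function name `resolve_citrine_key` and its parameters are hypothetical and are not part of matminer; only the CITRINE_API variable name comes from this notebook.

```python
import os


def resolve_citrine_key(explicit_key=None, environ=None):
    """Return an explicit key if given, else the CITRINE_API environment value.

    Hypothetical helper for illustration; it is not part of matminer.
    """
    environ = os.environ if environ is None else environ
    key = explicit_key or environ.get("CITRINE_API")
    if not key:
        # Fail early with a clear message instead of at the first API request.
        raise RuntimeError(
            "No Citrine API key found: set CITRINE_API or pass a key explicitly."
        )
    return key
```

Running such a check before the retrieval cell surfaces a missing key immediately rather than partway through a download.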

This notebook illustrates a few more advanced examples of matminer's visualization features. Note that these examples and a few additional ones are included in script form in the matminer_examples repository.

In [1]:
import pprint

import pandas as pd
from pymatgen.core.composition import Composition
from figrecipes import PlotlyFig
from matminer.datasets import load_dataset
from matminer.data_retrieval.retrieve_Citrine import CitrineDataRetrieval

Plotting thermoelectric data

This example generates a scatter plot of the properties of thermoelectric materials. The data is extracted via the Citrine data retrieval tools; the dataset ID on Citrine is 150557.

In [2]:
# Note that your Citrine API key must be set as the CITRINE_API 
# environment variable or as an argument to the CitrineDataRetrieval() constructor
cdr = CitrineDataRetrieval()
df_te = cdr.get_dataframe(criteria={'data_type': 'experimental', 'data_set_id': 150557},
                          properties=['Seebeck coefficient'], secondary_fields=True)

# Keep the columns of interest and convert them to numeric data types where possible;
# non-numeric columns such as 'chemicalFormula' are left unchanged (errors='ignore')
numeric_cols = ['chemicalFormula', 'Electrical resistivity', 'Seebeck coefficient',
                'Thermal conductivity', 'Thermoelectric figure of merit (zT)']
df_te = df_te[numeric_cols].apply(pd.to_numeric, errors='ignore')

# Keep rows with resistivities between 0.0005 and 0.1 and absolute
# Seebeck coefficients below 500, then simplify the zT column name
df_te = df_te[(5e-4 < df_te['Electrical resistivity']) & (df_te['Electrical resistivity'] < 0.1)]
df_te = df_te[abs(df_te['Seebeck coefficient']) < 500]
df_te = df_te.rename(columns={'Thermoelectric figure of merit (zT)': 'zT'})

pf = PlotlyFig(df_te, x_scale='log', fontfamily='Times New Roman',
               hovercolor='white', x_title='Electrical Resistivity (cm/S)',
               y_title='Seebeck Coefficient (uV/K)',
               colorbar_title='Thermal Conductivity (W/m.K)')

pf.xy((df_te['Electrical resistivity'], df_te['Seebeck coefficient']),
      colors='Thermal conductivity',
      color_range=[0, 5])
100%|██████████| 1093/1093 [01:03<00:00, 17.28it/s]
all available fields:
['Crystallinity', 'Electrical resistivity-units', 'Space group', 'Seebeck coefficient', 'Thermoelectric figure of merit (zT)-conditions', 'Electrical conductivity', 'Thermal conductivity-units', 'Electrical resistivity-conditions', 'Thermal conductivity-conditions', 'Electrical resistivity', 'uid', 'Power factor-dataType', 'Thermal conductivity', 'Preparation method', 'Seebeck coefficient-units', 'Electrical conductivity-conditions', 'references', 'Electrical resistivity-dataType', 'category', 'Power factor', 'Power factor-units', 'Thermoelectric figure of merit (zT)-dataType', 'Electrical conductivity-units', 'Seebeck coefficient-conditions', 'Electrical conductivity-dataType', 'Seebeck coefficient-dataType', 'Thermoelectric figure of merit (zT)', 'Power factor-conditions', 'Thermal conductivity-dataType', 'chemicalFormula']

suggested common fields:
['references', 'chemicalFormula', 'Crystallinity', 'Preparation method', 'Space group', 'Electrical resistivity', 'Electrical resistivity-units', 'Electrical resistivity-conditions', 'Electrical resistivity-dataType', 'Seebeck coefficient', 'Seebeck coefficient-units', 'Seebeck coefficient-conditions', 'Seebeck coefficient-dataType', 'Power factor', 'Power factor-units', 'Power factor-conditions', 'Power factor-dataType', 'Thermoelectric figure of merit (zT)', 'Thermoelectric figure of merit (zT)-conditions', 'Thermoelectric figure of merit (zT)-dataType', 'Thermal conductivity', 'Thermal conductivity-units', 'Thermal conductivity-conditions', 'Thermal conductivity-dataType', 'Electrical conductivity', 'Electrical conductivity-units', 'Electrical conductivity-conditions', 'Electrical conductivity-dataType']
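The conversion and filtering steps in the cell above can be tried without a Citrine API key on a small synthetic table. The sketch below is illustrative only: the rows are made up and are not values from the Citrine dataset, and it uses errors="coerce" (unparseable entries become NaN) rather than the errors='ignore' used in the notebook.

```python
import pandas as pd

# Synthetic rows for illustration only; not taken from the Citrine dataset.
df = pd.DataFrame({
    "chemicalFormula": ["A", "B", "C", "D"],
    "Electrical resistivity": ["0.001", "0.5", "0.01", "0.0001"],
    "Seebeck coefficient": ["-120", "80", "650", "200"],
})

# Values arrive as strings, so convert the measurement columns to numbers.
# Assumption: errors="coerce" is used here, so bad entries would become NaN.
numeric = ["Electrical resistivity", "Seebeck coefficient"]
df[numeric] = df[numeric].apply(pd.to_numeric, errors="coerce")

# Same thresholds as the notebook cell: 5e-4 < resistivity < 0.1
# and |Seebeck coefficient| < 500.
in_range = (df["Electrical resistivity"] > 5e-4) & (df["Electrical resistivity"] < 0.1)
small_seebeck = df["Seebeck coefficient"].abs() < 500
filtered = df[in_range & small_seebeck]

print(filtered)
```

Combining both masks in a single indexing step is equivalent to the two successive filters in the notebook cell.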