Advanced MPDS API usage: pVT-data and EoS fitting

  • Complexity level: green karate belt
  • Requirements: familiarity with computational thermodynamics

Among the opened MPDS data there is a considerable number of the diagrams cell parameters vs. temperature VT, as well as cell parameters vs. pressure pV. They provide us the link to the thermodynamic properties. Let's explore this link.

We will calculate an isothermal bulk modulus, as well as its pressure derivative, using the pV-data from the MPDS API and fitting to the equation of state (EoS).

Important! Before you proceed: the notebooks running at the third-party servers are not secure. Using this notebook assumes you authenticate at the MPDS server with your own API key. Please run this notebook only if you have an open-access account (i.e. an access section of your MPDS account reads: Programmatic data access: only open data).

Please do not run this notebook at the third-party servers if you have an elevated API access to the MPDS, since there's a nonzero probability of key leakage!

Be sure to always invalidate (revoke) your API key at your MPDS account after using the notebooks.

Now let's proceed with the authentication part. First, apply for an MPDS account, if you have none. Then copy your API key, run the next cell, paste the key in the appeared prompt input, and hit Enter.

In [ ]:
import os, getpass
os.environ['MPDS_KEY'] = getpass.getpass()

OK, now you may talk to the MPDS server programmatically from this notebook on your behalf.

For fitting we will use a pytheos pure-Python library written by Prof. Sang-Heon Dan Shim. (Note there's a wide range of the other EoS fitting tools!) Since the original pytheos only supports PY3, we now install its fork, supporting both PY2 and PY3:

In [ ]:
!pip install git+https://github.com/mpds-io/pytheos.git#egg=pytheos

(NB. A SyntaxError: invalid syntax exception occurs in PY2 while byte-compiling because not all the pytheos PY3 code was backported to PY2, however it's still OK for this tutorial.)

Make sure pytheos works, using this simple example:

In [ ]:
import numpy as np
from pytheos import BM3Model

v0, k0, k0p = 10, 200, 4

exp_bm3 = BM3Model()
v_data = v0 * np.linspace(0.6, 1, 20)

fit_params = exp_bm3.make_params(v0=v0, k0=k0, k0p=k0p)
p_bm3 = exp_bm3.eval(fit_params, v=v_data)
p_data = p_bm3 + np.random.normal(0.0, 2, 20)

fit_params['v0'].vary = False
fitresult_bm3 = exp_bm3.fit(p_data, fit_params, v=v_data, weights=None)

print(fitresult_bm3.fit_report())

If there were no errors at the previous step, let's continue.

We will download the available isothermal bulk moduli and their pressure derivatives from the MPDS.

In [ ]:
!pip install mpds_client>=0.0.17
In [ ]:
import warnings
import pandas as pd
import numpy as np
from numpy.linalg import det
from ase.geometry import cellpar_to_cell

from mpds_client import MPDSDataRetrieval

Note, if your API key isn't valid, the API returns an HTTP error 403.

In [ ]:
client = MPDSDataRetrieval()

dfrm_k0p = client.get_dataframe({"classes": "binary", "elements": "O", "props": "pressure derivative of isothermal bulk modulus"})
dfrm_k0p = dfrm_k0p[np.isfinite(dfrm_k0p['Phase'])] # only data for the existing distinct phases
avg_k0p =  dfrm_k0p.groupby('Phase')['Value'].median().to_frame().reset_index().rename(columns={'Value': 'avg_k0p'})

dfrm_k0 = client.get_dataframe({"props": "isothermal bulk modulus"}, phases=set(dfrm_k0p['Phase'].tolist()))
avg_k0 =  dfrm_k0.groupby('Phase')['Value'].median().to_frame().reset_index().rename(columns={'Value': 'avg_k0'})
avg_k0 =  avg_k0.merge(avg_k0p, how='inner', on='Phase')

Let's fit the isothermal bulk moduli and their pressure derivatives from the MPDS pV-data and compare with the experimental values:

In [ ]:
def get_cell_volume(a, b, c, alpha, beta, gamma):
    '''
    Calculate V from cell parameters.
    NB ab_normal and a_direction are standard.
    '''
    return abs(
        det(
            cellpar_to_cell([a, b, c, alpha, beta, gamma])
        )
    )
In [ ]:
pvts = {}

for matrix in client.get_data(
    {"props": "cell parameters - pressure diagram"},
    phases=set(dfrm_k0p['Phase'].tolist()), # only those phases we have experimental bulk modulus
    fields={}                               # all fields
):
    try: decks = matrix['sample']['measurement'][0]['property']['matrix']
    except KeyError:
        warnings.warn('Error: there is no expected property in the entry %s' % matrix)
        continue

    p_all, v_all, t_all = [], [], []
    for deck in decks:
        p, t, v = deck[0], deck[1], get_cell_volume(*deck[2:])
        p_all.append(p)
        v_all.append(v)
        t_all.append(t)

    # TODO here in principle we should do something smarter
    # than just omitting the data for the same phase
    if matrix['sample']['material']['phase_id'] in pvts and len(pvts[ matrix['sample']['material']['phase_id'] ][0]) > len(p_all):
            warnings.warn('Skipping entry %s' % matrix['sample']['material']['entry'])
            continue

    pvts[ matrix['sample']['material']['phase_id'] ] = [matrix['sample']['material']['entry'], p_all, v_all, t_all]

pvts = [[key] + value for key, value in pvts.items()]
pvts = pd.DataFrame(pvts, columns=['Phase', 'Entry', 'P', 'V', 'T'])
print(pvts)
In [ ]:
dfrm = avg_k0.merge(pvts, how='inner', on='Phase')

for n, system in dfrm.iterrows():

    params = exp_bm3.make_params(v0=system['V'][0], k0=system['avg_k0'], k0p=system['avg_k0p'])

    fitresult_bm3 = exp_bm3.fit(system['P'], params, v=system['V'], weights=None)
    if abs(fitresult_bm3.params['k0'].value - system['avg_k0']) > 50:
        # show the discrepancies, if any
        print("*" * 30 + " Distinct phase https://mpds.io/#phase_id/%s " % system['Phase'] + "*" * 30)
        print("BM_fit: %4.1f 	 BM_exp: %4.1f" % (fitresult_bm3.params['k0'].value, system['avg_k0'])
        print("BM0p_fit: %1.2f 	 BM0p_exp: %1.2f" % (fitresult_bm3.params['k0p'].value, system['avg_k0p'])

Were you able to follow everything? Please, try to answer:

  • Why did the discrepancies occur?
  • What other thermodynamic properties can be calculated via EoS?
  • Given a certain (e.g. unfamiliar and hypothetical) crystalline structure, how can we calculate its isothermal bulk modulus? What about its adiabatic bulk modulus?

PS don't forget to invalidate (revoke) your API key.