Zmatrices

In [1]:
import chemcoord as cc
import time
In [2]:
water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1).get_zmat()
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1).get_zmat()

Let's have a look at it:

In [3]:
water
Out[3]:
Zmat
atom b bond a angle d dihedral
1 O $\vec{0}$ 0.000000 $\vec{e_z}$ 0.000000 $\vec{e_x}$ 0.000000
2 H 1 0.910922 $\vec{e_z}$ 56.385853 $\vec{e_x}$ -0.000000
3 H 1 0.910922 2 107.000024 $\vec{e_x}$ -0.000000
4 O 2 2.351206 1 132.466298 3 -16.755013
5 H 4 0.910922 2 132.466298 1 -180.000000
6 H 4 0.910922 5 107.000024 2 163.244987

This is the normal zmatrix with the only difference being that the upper right triangle is filled with references to the origin and canonical unit vectors. Chemically speaking this fixes translational and rotational degrees of freedom and allows to preserve the information about the absolute position in space.

In this tutorial we concentrate on operations you can do with zmatrices.

Keep in mind, that there is an own tutorial dedicated to transformation from cartesian space to internal coordinates and back.

Slicing

The slicing operations are the same as for pandas.DataFrames. (http://pandas.pydata.org/pandas-docs/stable/indexing.html)

If the 'x' axis is of particular interest you can slice it out with:

In [4]:
water['bond']
# or explicit label based indexing
water.loc[:, 'bond']
# or explicit integer based indexing
water.iloc[:, 2]
Out[4]:
1    0.000000
2    0.910922
3    0.910922
4    2.351206
5    0.910922
6    0.910922
Name: bond, dtype: float64

Now it is very easy to e.g. count the atoms:

In [5]:
water['atom'].value_counts()
Out[5]:
H    4
O    2
Name: atom, dtype: int64

Returned type

The indexing behaves like Indexing and Selecting data in Pandas. You can slice with Zmat.loc[key], Zmat.iloc[keys], and Zmat[key]. The only question is about the return type. If the information in the columns is enough to draw a molecule, an instance of the own class (e.g. Zmat) is returned. If the information in the columns is not enough to draw a molecule, there are two cases to consider:

  • A pandas.Series instance is returned for one dimensional slices
  • A pandas.DataFrame instance is returned in all other cases:

      molecule.loc[:, ['atom', 'b', 'bond', 'a', 'angle', 'd', 'dihedral']] returns a `Zmat`.
    
      molecule.loc[:, ['atom', 'b']]`` returns a `pandas.DataFrame`.
    
      molecule.loc[:, 'atom']`` returns a `pandas.Series`.

If rows are omitted, there is never a Zmat instance returned.

Assignments

There exist four different methods to perform assignments: safe_loc, unsafe_loc, safe_iloc, and unsafe_iloc.

As in pandas safe_loc and unsafe_loc are used for label based indexing and safe_iloc and unsafe_iloc for row based indexing.

The safe methods assert that assignments do not lead to zmatrices, which can't be transformed back to cartesian coordinates. They also insert dummy atoms where necessary.

The unsafe methods are wrappers around their pandas.DataFrame counterparts and are a lot faster than the safe assignments.

Unless there is a good (performance) reason, it is recommended to use the safe assignment methods!

Symbolic evaluation

It is possible to use symbolic expressions from sympy.

In [6]:
import sympy
sympy.init_printing()
d = sympy.Symbol('d')
In [7]:
symb_water = water.copy()
In [8]:
symb_water.safe_loc[4, 'bond'] = d
In [9]:
symb_water
Out[9]:
Zmat
atom b bond a angle d dihedral
1 O $\vec{0}$ 0 $\vec{e_z}$ 0.000000 $\vec{e_x}$ 0.000000
2 H 1 0.910922 $\vec{e_z}$ 56.385853 $\vec{e_x}$ -0.000000
3 H 1 0.910922 2 107.000024 $\vec{e_x}$ -0.000000
4 O 2 $d$ 1 132.466298 3 -16.755013
5 H 4 0.910922 2 132.466298 1 -180.000000
6 H 4 0.910922 5 107.000024 2 163.244987
In [10]:
symb_water.subs(d, 2)
Out[10]:
Zmat
atom b bond a angle d dihedral
1 O $\vec{0}$ 0.000000 $\vec{e_z}$ 0.000000 $\vec{e_x}$ 0.000000
2 H 1 0.910922 $\vec{e_z}$ 56.385853 $\vec{e_x}$ -0.000000
3 H 1 0.910922 2 107.000024 $\vec{e_x}$ -0.000000
4 O 2 2.000000 1 132.466298 3 -16.755013
5 H 4 0.910922 2 132.466298 1 -180.000000
6 H 4 0.910922 5 107.000024 2 163.244987
In [11]:
cc.xyz_functions.view([symb_water.subs(d, i).get_cartesian() for i in range(2, 5)])

# Uncomment if viewer cannot open molden files
# for i in range(2, 5):
#     symb_water.subs(d, i).get_cartesian().view()
#     time.sleep(1)

Binary operators

Mathematical Operations:

The general rule is that mathematical operations using the binary operators (+ - * /) and the unary operators (+ - abs) are only applied to the ['bond', 'angle', 'dihedral'] columns.

Addition/Subtraction/Multiplication/Division: The most common case is to add another Zmat instance. In this case it is tested, if the used references are the same. Afterwards the addition in the ['bond', 'angle', 'dihedral'] columns is performed. If you add a scalar to a Zmat it is added elementwise onto the ['bond', 'angle', 'dihedral'] columns. If you add a 3-dimensional vector, list, tuple... the first element of this vector is added elementwise to the 'bond' column of the Zmat instance and so on. The third possibility is to add a matrix with shape=(len(Zmat), 3) which is again added elementwise. The same rules are true for subtraction, division and multiplication.

In [12]:
distortion = water.copy()
distortion.unsafe_loc[:, ['bond', 'angle', 'dihedral']] = 0
distortion.safe_loc[4, 'bond'] = d
In [13]:
water + distortion
Out[13]:
Zmat
atom b bond a angle d dihedral
1 O $\vec{0}$ 0 $\vec{e_z}$ 0.000000 $\vec{e_x}$ 0.000000
2 H 1 0.910922 $\vec{e_z}$ 56.385853 $\vec{e_x}$ 0.000000
3 H 1 0.910922 2 107.000024 $\vec{e_x}$ 0.000000
4 O 2 $d + 2.3512055093207$ 1 132.466298 3 -16.755013
5 H 4 0.910922 2 132.466298 1 -180.000000
6 H 4 0.910922 5 107.000024 2 163.244987
In [14]:
(water + distortion).subs(d, 3)
Out[14]:
Zmat
atom b bond a angle d dihedral
1 O $\vec{0}$ 0.000000 $\vec{e_z}$ 0.000000 $\vec{e_x}$ 0.000000
2 H 1 0.910922 $\vec{e_z}$ 56.385853 $\vec{e_x}$ 0.000000
3 H 1 0.910922 2 107.000024 $\vec{e_x}$ 0.000000
4 O 2 5.351206 1 132.466298 3 -16.755013
5 H 4 0.910922 2 132.466298 1 -180.000000
6 H 4 0.910922 5 107.000024 2 163.244987
In [15]:
(water + distortion).subs(d, 3).get_cartesian().view()

Movements without small angular approximation

In [16]:
cartesians = cc.xyz_functions.read_molden('MIL53_ht_lt_movement.molden')
STEPS = 5

A typical approximation is the small angular approximation when calculating activation barriers for the transition from one allotrope to another.

Let's import the following allotropes:

In [17]:
m1, m2 = cartesians[0], cartesians[-1]

The easiest approach for obtaining structures for the movement from m1 to m2, is to linearize the movement in cartesian space.

For geometric reasons this leads to shortened bond lengths, when bond angles change. The larger the change in angles, the larger this effect which leads to an overestimation of the activation energy.

In [18]:
cc.xyz_functions.view([m1 + (m2 - m1) * i / STEPS for i in range(STEPS)])

# Uncomment if viewer cannot open molden files
# for molecule in [m1 + (m2 - m1) * i / STEPS for i in range(STEPS)]:
#     molecule.view()
#     time.sleep(1)

Another approach is to build two Zmatrices with the exact same references/construction table and linearise the movement in the space of internal coordinates.

In [19]:
zm1 = m1.get_zmat()
c_table = zm1.loc[:, ['b', 'a', 'd']]
zm2 = m2.get_zmat(c_table)

The term zm2 - zm1 is not convertible to cartesian coordinates. For this reason we have to switch of the testing for the validity of zmatrices when using operators.

The minimize_dihedrals method chooses the minimal absolute value representation of an angle equivalence class. So it will move e.g. by -30° instead of 270°.

In [20]:
with cc.TestOperators(False):
    zmats = [zm1 + (zm2 - zm1).minimize_dihedrals() * i / STEPS for i in range(STEPS)]

The resulting structures are a lot more reasonable for the interconversion of allotropes. (Look for example at the C-H distance in methyl groups).

In [21]:
cc.xyz_functions.view([x.get_cartesian() for x in zmats])

# Uncomment if viewer cannot open molden files
# for molecule in [x.get_cartesian() for x in zmats]:
#     molecule.view()
#     time.sleep(1)
In [ ]: