Introduction¶

Welcome to the tutorial for ChemCoord (http://chemcoord.readthedocs.org/).

The manipulation of the coordinates is a lot easier, if you can view them on the fly. So please install a molecule viewer, which opens xyz-files. A non complete list includes: molcas gv, avogadro, vmd, and pymol

Cartesian¶

In [1]:

import chemcoord as cc
from chemcoord.xyz_functions import get_rotation_matrix
import numpy as np
import time

In [2]:

water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1)
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1)
middle = cc.Cartesian.read_xyz('MIL53_middle.xyz', start_index=1)

Let's have a look at it:

In [3]:

water

Out[3]:

Cartesian
	atom	x	y	z
1	O	0.000000	0.0	0.000000
2	H	0.758602	0.0	0.504284
3	H	0.260455	0.0	-0.872893
4	O	3.000000	0.5	0.000000
5	H	3.758602	0.5	0.504284
6	H	3.260455	0.5	-0.872893

It is also possible to open it with an external viewer. I use Molcas gv.exe so you have to change it accordingly to your program of choice.

In [4]:

water.view(viewer='gv.exe')

To make this setting permament, execute:

In [5]:

cc.settings['defaults']['viewer'] = 'gv.exe'  # replace by your viewer of choice

Slicing¶

The slicing operations are the same as for pandas.DataFrames. (http://pandas.pydata.org/pandas-docs/stable/indexing.html)

If the 'x' axis is of particular interest you can slice it out with:

In [6]:

water['x']
# or explicit label based indexing
water.loc[:, 'x']
# or explicit integer based indexing
water.iloc[:, 1]

Out[6]:

1    0.000000
2    0.758602
3    0.260455
4    3.000000
5    3.758602
6    3.260455
Name: x, dtype: float64

With boolean slicing it is very easy to cut all the oxygens away:

In [7]:

water[water['atom'] != 'O'].view()

This can be combined with other selections:

In [8]:

water[(water['atom'] != 'O') & (water['x'] < 1)].view()

Returned type¶

The indexing behaves like Indexing and Selecting data in Pandas. You can slice with Cartesian.loc[key], Cartesian.iloc[keys], and Cartesian[key]. The only question is about the return type. If the information in the columns is enough to draw a molecule, an instance of the own class (e.g. Cartesian) is returned. If the information in the columns is not enough to draw a molecule, there are two cases to consider:

A pandas.Series instance is returned for one dimensional slices

A pandas.DataFrame instance is returned in all other cases:

  molecule.loc[:, ['atom', 'x', 'y', 'z']] returns a `Cartesian`.

  molecule.loc[:, ['atom', 'x']]`` returns a `pandas.DataFrame`.

  molecule.loc[:, 'atom']`` returns a `pandas.Series`.

Sideeffects and Method chaining¶

Two general rules are:

All functions are sideeffect free unless stated otherwise in the documentation.
Where possible the methods return an instance of the own class, to allow method chaining.

Have a look at the unmodified molecule

In [9]:

middle.view()

Chain the methods:

In [10]:

middle.cut_sphere(radius=5, preserve_bonds=False).view()

The molecule itself remains unchanged.

In [11]:

middle.view()

Chemical bonds¶

One really important method is get_bonds(). It returns a connectivity table, which is represented by a dictionary. Each index points to set of indices, that are connected to it.

In [12]:

water.get_bonds()

Out[12]:

{1: {2, 3}, 2: {1}, 3: {1}, 4: {5, 6}, 5: {4}, 6: {4}}

Now the focus switches to another molecule (MIL53_middle)

Let's explore the coordinationsphere of the Cr atom with the index 7.

In [13]:

for i in range(3):
    middle.get_coordination_sphere(13, n_sphere=i, only_surface=False).view()
    time.sleep(1)

Binary operators¶

Mathematical Operations:¶

Binary operators are supported in the logic of the scipy stack, but you need python3.x for using the matrix multiplication operator @.

The general rule is that mathematical operations using the binary operators (+ - * / @) and the unary operatos (+ - abs) are only applied to the ['x', 'y', 'z'] columns.

Addition/Subtraction/Multiplication/Division: If you add a scalar to a Cartesian it is added elementwise onto the ['x', 'y', 'z'] columns. If you add a 3-dimensional vector, list, tuple... the first element of this vector is added elementwise to the 'x' column of the Cartesian instance and so on. The last possibility is to add a matrix with shape=(len(Cartesian), 3) which is again added elementwise. The same rules are true for subtraction, division and multiplication.

Matrixmultiplication: Only leftsided multiplication with a matrix of shape=(n, 3), where n is a natural number, is supported. The usual usecase is for example np.diag([1, 1, -1]) @ cartesian_instance to mirror on the x-y plane. Note that if A is the matrix which is multiplied from the left, and X is the shape=(n, 3)-matrix consisting of the ['x', 'y', 'z'] columns. The actual calculation is: (A @ X.T).T

In [14]:

(water + 3).view()

In [15]:

(get_rotation_matrix([1, 0, 0], np.radians(90)) @ water).view()
# If you use python2.x the @ operator is not supported. then you have to use xyz_functions.dot

Comparison:¶

The comparison operators == and != are supported and require molecules indexed in the same way:

In some cases it is better to test for numerical equality $ |a - b| < \epsilon$. This is done using allclose or isclose (elementwise)

In [16]:

water == water + 1e-15

Out[16]:

	atom	x	y	z
1	True	False	False	False
2	True	False	False	False
3	True	False	False	False
4	True	False	False	False
5	True	False	False	False
6	True	False	False	False

In [17]:

cc.xyz_functions.isclose(water, water + 1e-15)

Out[17]:

	atom	x	y	z
1	True	True	True	True
2	True	True	True	True
3	True	True	True	True
4	True	True	True	True
5	True	True	True	True
6	True	True	True	True

In [18]:

cc.xyz_functions.allclose(water, water + 1e-15)

Out[18]:

True

Symbolic evaluation¶

It is possible to use symbolic expressions from sympy.

In [19]:

import sympy
sympy.init_printing()
x = sympy.Symbol('x')

In [20]:

symb_water = water.copy()

In [21]:

symb_water['x'] = [x + i for i in range(len(symb_water))]

In [22]:

symb_water

Out[22]:

Cartesian
	atom	x	y	z
1	O	$x$	0.0	0.000000
2	H	$x + 1$	0.0	0.504284
3	H	$x + 2$	0.0	-0.872893
4	O	$x + 3$	0.5	0.000000
5	H	$x + 4$	0.5	0.504284
6	H	$x + 5$	0.5	-0.872893

In [23]:

symb_water.subs(x, 2)

Out[23]:

Cartesian
	atom	x	y	z
1	O	2.0	0.0	0.000000
2	H	3.0	0.0	0.504284
3	H	4.0	0.0	-0.872893
4	O	5.0	0.5	0.000000
5	H	6.0	0.5	0.504284
6	H	7.0	0.5	-0.872893

In [24]:

symb_water.subs(x, 2).view()

Alignment¶

In [25]:

moved = get_rotation_matrix([1, 2, 3], 1.1) @ middle + 15

In [26]:

moved.view()

In [27]:

m1, m2 = middle.align(moved)

In [28]:

cc.xyz_functions.view([m1, m2])
# If your viewer of choice does not support molden files, you have to call separately:
# m1.view()
# m2.view()

Symmetry¶

It is possible to detect the point group and symmetrize a molecule. Let's distort a $C_{2,v}$ symmetric molecule and symmetrize it back:

In [29]:

np.random.seed(77)
dist_molecule = small.copy()
dist_molecule += np.random.randn(len(dist_molecule), 3) / 25

In [30]:

dist_molecule.get_pointgroup(tolerance=0.1)

Out[30]:

C1

In [31]:

eq = dist_molecule.symmetrize(max_n=25, tolerance=0.3, epsilon=1e-5)

In [32]:

eq['sym_mol'].get_pointgroup(tolerance=0.1)

Out[32]:

C2v

In [33]:

a, b = small.align(dist_molecule)
a, c = small.align(eq['sym_mol'])
d1 = (a - b).get_distance_to()
d2 = (a - c).get_distance_to()

In [34]:

cc.xyz_functions.view([a, b, c])
# If your viewer of choice does not support molden files, you have to call separately:
# a.view()
# b.view()
# c.view()

As we can see, the symmetrised molecule is a lot more similar to the original molecule. The average deviation from the original positions decreased by 35 %.

In [35]:

(d1['distance'].sum() - d2['distance'].sum()) / d1['distance'].sum()

Out[35]:

$$0.34609079359070377$$

In [ ]: