Welcome to the tutorial for ChemCoord (

The manipulation of the coordinates is a lot easier, if you can view them on the fly. So please install a molecule viewer, which opens xyz-files. A non complete list includes: molcas gv, avogadro, vmd, and pymol


In [1]:
import chemcoord as cc
from chemcoord.xyz_functions import get_rotation_matrix
import numpy as np
import time
In [2]:
water = cc.Cartesian.read_xyz('', start_index=1)
small = cc.Cartesian.read_xyz('', start_index=1)
middle = cc.Cartesian.read_xyz('', start_index=1)

Let's have a look at it:

In [3]:
atom x y z
1 O 0.000000 0.0 0.000000
2 H 0.758602 0.0 0.504284
3 H 0.260455 0.0 -0.872893
4 O 3.000000 0.5 0.000000
5 H 3.758602 0.5 0.504284
6 H 3.260455 0.5 -0.872893

It is also possible to open it with an external viewer. I use Molcas gv.exe so you have to change it accordingly to your program of choice.

In [4]:

To make this setting permament, execute:

In [5]:
cc.settings['defaults']['viewer'] = 'gv.exe'  # replace by your viewer of choice


The slicing operations are the same as for pandas.DataFrames. (

If the 'x' axis is of particular interest you can slice it out with:

In [6]:
# or explicit label based indexing
water.loc[:, 'x']
# or explicit integer based indexing
water.iloc[:, 1]
1    0.000000
2    0.758602
3    0.260455
4    3.000000
5    3.758602
6    3.260455
Name: x, dtype: float64

With boolean slicing it is very easy to cut all the oxygens away:

In [7]:
water[water['atom'] != 'O'].view()

This can be combined with other selections:

In [8]:
water[(water['atom'] != 'O') & (water['x'] < 1)].view()

Returned type

The indexing behaves like Indexing and Selecting data in Pandas. You can slice with Cartesian.loc[key], Cartesian.iloc[keys], and Cartesian[key]. The only question is about the return type. If the information in the columns is enough to draw a molecule, an instance of the own class (e.g. Cartesian) is returned. If the information in the columns is not enough to draw a molecule, there are two cases to consider:

  • A pandas.Series instance is returned for one dimensional slices
  • A pandas.DataFrame instance is returned in all other cases:

      molecule.loc[:, ['atom', 'x', 'y', 'z']] returns a `Cartesian`.
      molecule.loc[:, ['atom', 'x']]`` returns a `pandas.DataFrame`.
      molecule.loc[:, 'atom']`` returns a `pandas.Series`.

Sideeffects and Method chaining

Two general rules are:

  1. All functions are sideeffect free unless stated otherwise in the documentation.
  2. Where possible the methods return an instance of the own class, to allow method chaining.

Have a look at the unmodified molecule

In [9]:

Chain the methods:

In [10]:
middle.cut_sphere(radius=5, preserve_bonds=False).view()

The molecule itself remains unchanged.

In [11]:

Chemical bonds

One really important method is get_bonds(). It returns a connectivity table, which is represented by a dictionary. Each index points to set of indices, that are connected to it.

In [12]:
{1: {2, 3}, 2: {1}, 3: {1}, 4: {5, 6}, 5: {4}, 6: {4}}

Now the focus switches to another molecule (MIL53_middle)

Let's explore the coordinationsphere of the Cr atom with the index 7.

In [13]:
for i in range(3):
    middle.get_coordination_sphere(13, n_sphere=i, only_surface=False).view()

Binary operators

Mathematical Operations:

Binary operators are supported in the logic of the scipy stack, but you need python3.x for using the matrix multiplication operator @.

The general rule is that mathematical operations using the binary operators (+ - * / @) and the unary operatos (+ - abs) are only applied to the ['x', 'y', 'z'] columns.

Addition/Subtraction/Multiplication/Division: If you add a scalar to a Cartesian it is added elementwise onto the ['x', 'y', 'z'] columns. If you add a 3-dimensional vector, list, tuple... the first element of this vector is added elementwise to the 'x' column of the Cartesian instance and so on. The last possibility is to add a matrix with shape=(len(Cartesian), 3) which is again added elementwise. The same rules are true for subtraction, division and multiplication.

Matrixmultiplication: Only leftsided multiplication with a matrix of shape=(n, 3), where n is a natural number, is supported. The usual usecase is for example np.diag([1, 1, -1]) @ cartesian_instance to mirror on the x-y plane. Note that if A is the matrix which is multiplied from the left, and X is the shape=(n, 3)-matrix consisting of the ['x', 'y', 'z'] columns. The actual calculation is: (A @ X.T).T

In [14]:
(water + 3).view()
In [15]:
(get_rotation_matrix([1, 0, 0], np.radians(90)) @ water).view()
# If you use python2.x the @ operator is not supported. then you have to use


The comparison operators == and != are supported and require molecules indexed in the same way:

In some cases it is better to test for numerical equality $ |a - b| < \epsilon$. This is done using allclose or isclose (elementwise)

In [16]:
water == water + 1e-15
atom x y z
1 True False False False
2 True False False False
3 True False False False
4 True False False False
5 True False False False
6 True False False False
In [17]:
cc.xyz_functions.isclose(water, water + 1e-15)
atom x y z
1 True True True True
2 True True True True
3 True True True True
4 True True True True
5 True True True True
6 True True True True
In [18]:
cc.xyz_functions.allclose(water, water + 1e-15)

Symbolic evaluation

It is possible to use symbolic expressions from sympy.

In [19]:
import sympy
x = sympy.Symbol('x')
In [20]:
symb_water = water.copy()
In [21]:
symb_water['x'] = [x + i for i in range(len(symb_water))]
In [22]:
atom x y z
1 O $x$ 0.0 0.000000
2 H $x + 1$ 0.0 0.504284
3 H $x + 2$ 0.0 -0.872893
4 O $x + 3$ 0.5 0.000000
5 H $x + 4$ 0.5 0.504284
6 H $x + 5$ 0.5 -0.872893
In [23]:
symb_water.subs(x, 2)
atom x y z
1 O 2.0 0.0 0.000000
2 H 3.0 0.0 0.504284
3 H 4.0 0.0 -0.872893
4 O 5.0 0.5 0.000000
5 H 6.0 0.5 0.504284
6 H 7.0 0.5 -0.872893
In [24]:
symb_water.subs(x, 2).view()


In [25]:
moved = get_rotation_matrix([1, 2, 3], 1.1) @ middle + 15
In [26]:
In [27]:
m1, m2 = middle.align(moved)
In [28]:
cc.xyz_functions.view([m1, m2])
# If your viewer of choice does not support molden files, you have to call separately:
# m1.view()
# m2.view()


It is possible to detect the point group and symmetrize a molecule. Let's distort a $C_{2,v}$ symmetric molecule and symmetrize it back:

In [29]:
dist_molecule = small.copy()
dist_molecule += np.random.randn(len(dist_molecule), 3) / 25
In [30]:
In [31]:
eq = dist_molecule.symmetrize(max_n=25, tolerance=0.3, epsilon=1e-5)
In [32]:
In [33]:
a, b = small.align(dist_molecule)
a, c = small.align(eq['sym_mol'])
d1 = (a - b).get_distance_to()
d2 = (a - c).get_distance_to()
In [34]:
cc.xyz_functions.view([a, b, c])
# If your viewer of choice does not support molden files, you have to call separately:
# a.view()
# b.view()
# c.view()

As we can see, the symmetrised molecule is a lot more similar to the original molecule. The average deviation from the original positions decreased by 35 %.

In [35]:
(d1['distance'].sum() - d2['distance'].sum()) / d1['distance'].sum()
In [ ]: