In [1]:

```
import chemcoord as cc
import time
import pandas as pd
```

In [2]:

```
water = cc.Cartesian.read_xyz('water_dimer.xyz', start_index=1)
zwater = water.get_zmat()
small = cc.Cartesian.read_xyz('MIL53_small.xyz', start_index=1)
```

The table which defines the used references of each atom will be called
**construction table**.

The contruction table of the zmatrix of the water dimer can be seen here:

In [3]:

```
zwater.loc[:, ['b', 'a', 'd']]
```

Out[3]:

The absolute references are indicated by magic strings: `['origin', 'e_x', 'e_y', 'e_z']`

.

- The atom which is to be set in the reference of three other atoms, is denoted $i$.
- The bond-defining atom is represented by $b$.
- The angle-defining atom is represented by $a$.
- The dihedral-defining atom is represented by $d$.

It is advantageous to treat a zmatrix simply as recursive spherical coordinates.

The $(n + 1)$-th atom uses three of the previous $n$ atoms as reference. Those three atoms ($b, a, d$) are spanning a coordinate system, if we require righthandedness. If we express the position of the atom $i$ in respect to this locally spanned coordinate system using spherical coordinates, we arrive at the usual definition of a zmatrix.

PS: The question about right- or lefthandedness is equivalent to specifying a direction of rotation. Chemcoord uses of course the IUPAC definition.

The ideal (and luckily most common) case is, that $\vec{ib}$, $\vec{ba}$, and $\vec{ad}$ are linearly independent. In this case there exist a bijective mapping between spherical coordinates and cartesian coordinates and all angles, positions... are well defined.

One pathologic case appears, if $\vec{ib}$ and $\vec{ba}$ are linear dependent.

This means, that the angle in the zmatrix is either $0^\circ$ or $180^\circ$. In this case there are infinitely many dihedral angles for the same configuration in cartesian space. Or to say it in a more analytical way: The transformation from spherical coordinates to cartesian coordinates is surjective, but not injective.

For nearly all cases (e.g. expressing the potential hyper surface in terms of internal coordinates), the surjectivity property is sufficient.

A lot of other problematic cases can be automatically solved by assigning a default value to the dihedral angle by definition ($0^\circ$ in the case of chemcoord).

Usually the user does not need to think about this case, which is automatically handled by chemcoord.

The real pathologic case appears, if the three reference atoms are linear.

It is important to note, that this is not a problem in the spherical coordinates of `i`

.
The coordinate system itself which is spanned by `b`

, `a`

and `d`

is undefined.
This means, that it is not visible directly from the values in the Zmatrix, if `i`

uses an invalid reference.

I will use the term valid Zmatrix if all atoms `i`

have valid references. In this case the transformation to cartesian coordinates is well defined.

Now there are two cases:

Chemcoord asserts, that the Zmatrix which is created from cartesian coordinates using `get_zmat`

is a valid Zmatrix (or raises an explicit exception if it fails at finding valid references.) This is always done by choosing other references (instead of introducing dummy atoms.)

If a valid Zmatrix is manipulated after creation, it might occur because of an assignment, that `b`

, `a`

, and `d`

are moved into a linear relationship. In this case a dummy atom is inserted which lies in the plane which was spanned by `b`

, `a`

, and `d`

before the assignment. It uses the same references as the atom `d`

, so changes in the references of `b`

, `a`

and `d`

are also present in the position of the dummy atom `X`

.
This is done using the safe assignment methods of chemcoord.

In [4]:

```
water = water - water.loc[5, ['x', 'y', 'z']]
zmolecule = water.get_zmat()
c_table = zmolecule.loc[:, ['b', 'a', 'd']]
c_table.loc[6, ['a', 'd']] = [2, 1]
zmolecule1 = water.get_zmat(construction_table=c_table)
zmolecule2 = zmolecule1.copy()
zmolecule3 = water.get_zmat(construction_table=c_table)
```

In [5]:

```
angle_before_assignment = zmolecule1.loc[4, 'angle']
```

In [6]:

```
zmolecule1.safe_loc[4, 'angle'] = 180
```

In [7]:

```
zmolecule1.safe_loc[5, 'dihedral'] = 90
```

In [8]:

```
zmolecule1.safe_loc[4, 'angle'] = angle_before_assignment
```

In [9]:

```
xyz1 = zmolecule1.get_cartesian()
```

In [10]:

```
xyz1.view()
```

With the following contextmanager we can switch the automatic insertion of dummy atoms of and look at the cartesian which is built after assignment of `.safe_loc[4, 'angle'] = 180`

. It is obvious from the structure, that the coordinate system spanned by `O - H - O`

is undefined. This was the second pathological case in the mathematical introduction.

In [11]:

```
with cc.DummyManipulation(False):
try:
zmolecule3.safe_loc[4, 'angle'] = 180
except cc.exceptions.InvalidReference as e:
e.already_built_cartesian.view()
```

It is possible to use symbolic expressions from sympy.

In [12]:

```
import sympy
sympy.init_printing()
d = sympy.Symbol('d')
```

In [13]:

```
symb_water = zwater.copy()
```

In [14]:

```
symb_water.safe_loc[4, 'bond'] = d
```

In [15]:

```
symb_water
```

Out[15]:

In [16]:

```
symb_water.subs(d, 2)
```

Out[16]:

In [17]:

```
cc.xyz_functions.view([symb_water.subs(d, i).get_cartesian() for i in range(2, 5)])
# If your viewer cannot open molden files you have to uncomment the following lines
# for i in range(2, 5):
# symb_water.subs(d, i).get_cartesian().view()
# time.sleep(1)
```

The construction table in chemcoord is represented by a pandas DataFrame with the columns `['b', 'a', 'd']`

which can be constructed manually.

In [18]:

```
pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=['b', 'a', 'd'])
```

Out[18]:

It is possible to specify only the first $i$ rows of a Zmatrix, in order to compute the $i + 1$ to $n$ rows automatically. If the molecule consists of unconnected fragments, the construction tables are created independently for each fragment and connected afterwards. It is important to note, that an unfragmented, monolithic molecule is treated in the same way. It just consists of one fragment. This means that in several methods where a list of fragments is returned or taken, an one element list appears.

If the Zmatrix is automatically created, the oxygen 1 is the first atom. Let's assume, that we want to change the order of fragments.

In [19]:

```
water.get_zmat()
```

Out[19]:

Let's fragmentate the water

In [20]:

```
fragments = water.fragmentate()
```

In [21]:

```
c_table = water.get_construction_table(fragment_list=[fragments[1], fragments[0]])
```

In [22]:

```
water.get_zmat(c_table)
```

Out[22]:

If we want to specify the order in the second fragment, so that it connects via the oxygen 1, it is important to note, that we have to specify the full row. **It is not possible to define just the order without the references**.

In [23]:

```
frag_c_table = pd.DataFrame([[4, 6, 5], [1, 4, 6], [1, 2, 4]], columns=['b', 'a', 'd'], index=[1, 2, 3])
```

In [24]:

```
c_table2 = water.get_construction_table(fragment_list=[fragments[1], (fragments[0], frag_c_table)])
```

In [25]:

```
water.get_zmat(c_table2)
```

Out[25]:

In [ ]:

```
```