Welcome to the Treepace tutorial. First, we import the library:

In [1]:

```
from treepace import *
```

The basic unit of all trees is a node.

In [2]:

```
Node("label")
```

Out[2]:

In Treepace, any object (not only a string) can serve as the label of a node.

In [3]:

```
from glob import glob
from IPython.display import display
with open(glob('*.ipynb')[0], 'rb') as file_handle:
    display(Node(file_handle))
```

A node has children, which can have other children...

In [4]:

```
root = Node('root',
            [Node('c1'), Node('c2',
                              [Node('subchild')])])
```
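The nesting above can be pictured with a few lines of plain Python. This is only an illustrative sketch, not Treepace's actual implementation: `SketchNode` and `preorder` are hypothetical names, while the `value`, `children` and `parent` attributes mirror the ones this tutorial uses on real nodes.

```python
class SketchNode:
    """Hypothetical stand-in for a tree node (not the Treepace class)."""

    def __init__(self, value, children=()):
        self.value = value
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self  # each child remembers its parent


def preorder(node):
    """Yield the node itself, then all of its descendants, depth-first."""
    yield node
    for child in node.children:
        yield from preorder(child)


sketch_root = SketchNode('root',
                         [SketchNode('c1'),
                          SketchNode('c2', [SketchNode('subchild')])])
labels = [n.value for n in preorder(sketch_root)]
print(labels)
```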

A tree is defined by the reference to the root node.

In [5]:

```
Tree(root)
```

Out[5]:

It is possible to load a tree from, and save it to, various formats such as tab-indented text, parenthesized text, or XML.

In [6]:

```
print(Tree.load('root (element1 (sub-element) element2)').save(IndentedText))
```
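To make the two text formats concrete, here is a rough sketch of the conversion in plain Python. It does not use Treepace and is not how the library is implemented; `parse_parenthesized` and `to_indented` are hypothetical helpers, and the sketch assumes labels contain no whitespace or parentheses and that the input is well-formed.

```python
import re


def parse_parenthesized(text):
    """Parse 'root (a (b) c)' into nested (label, children) tuples."""
    tokens = re.findall(r'[()]|[^\s()]+', text)
    pos = 0

    def parse_node():
        nonlocal pos
        label = tokens[pos]
        pos += 1
        children = []
        if pos < len(tokens) and tokens[pos] == '(':
            pos += 1  # skip '('
            while tokens[pos] != ')':
                children.append(parse_node())
            pos += 1  # skip ')'
        return (label, children)

    return parse_node()


def to_indented(node, depth=0):
    """Return one line per node, indented with one tab per tree level."""
    label, children = node
    lines = ['\t' * depth + label]
    for child in children:
        lines.extend(to_indented(child, depth + 1))
    return lines


parsed = parse_parenthesized('root (element1 (sub-element) element2)')
indented = '\n'.join(to_indented(parsed))
print(indented)
```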

A subtree is a connected part of the tree consisting of selected nodes of the main tree (highlighted in blue).

In [7]:

```
Subtree([root, root.children[1]])
```

Out[7]:

As we will see later, the searching methods return `Match` objects. Each match consists of groups (subtrees), where group 0 represents the whole match, just like in a regex. In this tutorial, it is highlighted in green.

In [8]:

```
c2 = root.children[1]
Match([Subtree([c2, c2.parent]),
       Subtree([c2])
      ])
```

Out[8]:
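
For readers less familiar with the regex convention that this group numbering borrows, the standard `re` module behaves as follows. The example is about Python regular expressions only; it does not call Treepace.

```python
import re

# Group 0 is the whole match; explicit groups are numbered from 1.
regex_match = re.search(r'(\d+)-(\d+)', 'call 555-1234 now')
whole = regex_match.group(0)
first = regex_match.group(1)
second = regex_match.group(2)
print(whole, first, second)
```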

To search for a pattern anywhere in the tree, use the `search()` method. The result is a list of matches.

The most basic pattern is a dot which matches one arbitrary node.

In [9]:

```
tree = Tree.load('a (b c)')
tree.search('.')
```

Out[9]:

A text literal matches the nodes whose string representation is equal to the given literal.

In [10]:

```
tree.search('a')
```

Out[10]:

A pattern can contain arbitrary Python code, enclosed in square brackets. The expression is evaluated for each relevant node (accessible in the expression via the variable `node`) and matches if its result equals `True`.

In [11]:

```
tree.search('[node.value != "c"]')
```

Out[11]:

An underscore is a shortcut for `node.value`.

In [12]:

```
tree.search('[_.upper() == "C"]')
```

Out[12]:
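
How such a predicate could be evaluated is sketched below in plain Python. This is an assumption about the general mechanism, not Treepace's code: `matching_values` is a hypothetical helper, and the nodes are simple stand-in objects that only have a `value` attribute.

```python
from types import SimpleNamespace

# Stand-in nodes with the same labels as the tree 'a (b c)'.
nodes = [SimpleNamespace(value=v) for v in ('a', 'b', 'c')]


def matching_values(expression, candidates):
    """Evaluate the expression once per node, exposing `node` and `_`."""
    code = compile(expression, '<predicate>', 'eval')
    result = []
    for node in candidates:
        names = {'node': node, '_': node.value}
        if eval(code, {}, names) == True:  # "matches if its result equals True"
            result.append(node.value)
    return result


not_c = matching_values('node.value != "c"', nodes)
upper_c = matching_values('_.upper() == "C"', nodes)
print(not_c, upper_c)
```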

Multiple node patterns can be connected using relations. In the following example, we search for a node 'a' which has a child 'b'. The whole subtree is returned – not only the final component.

In [13]:

```
tree.search('a < b')
```

Out[13]:

Other available relations are: immediately following sibling (`,`), any sibling (`&`) and parent (`>`).

In [14]:

```
tree.search('a < b, c')[0]
```

Out[14]:

In [15]:

```
tree.search('a < c & b')[0]
```

Out[15]:
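
The difference between the child and sibling relations can be spelled out on a toy structure. The sketch below is an interpretation of the descriptions in this tutorial, written in plain Python with hypothetical helper names; in particular, it assumes that `,` is order-sensitive while `&` is not.

```python
# The tree 'a (b c)' as (label, children) tuples.
toy_tree = ('a', [('b', []), ('c', [])])


def child_labels(node):
    return [label for label, _ in node[1]]


def has_child(node, label):
    """Rough analogue of 'parent < child'."""
    return label in child_labels(node)


def immediately_followed_by(node, first, second):
    """Rough analogue of 'first, second': adjacent children in this order."""
    labels = child_labels(node)
    return any(a == first and b == second
               for a, b in zip(labels, labels[1:]))


def are_siblings(node, first, second):
    """Rough analogue of 'first & second': two distinct children, any order."""
    labels = child_labels(node)
    return any(i != j and a == first and b == second
               for i, a in enumerate(labels)
               for j, b in enumerate(labels))


print(has_child(toy_tree, 'b'))
print(immediately_followed_by(toy_tree, 'b', 'c'))
print(immediately_followed_by(toy_tree, 'c', 'b'))
print(are_siblings(toy_tree, 'c', 'b'))
```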

The 'parent' relationship is implicitly followed by a 'match any node' pattern. This is useful to form queries like this:

In [16]:

```
Tree.load('a (b (c) d (e))').search('a < b <c>, d')[0]
```

Out[16]:

To mark a part of the match as a group, enclose it in curly braces. The groups are numbered from 1 and can be nested.

In [17]:

```
tree.search('{a < {b}, {c}}')[0]
```

Out[17]:

It is possible to back-reference saved groups using `$n`.

In [18]:

```
Tree.load('m (n (o) m (n))').search('{m < n}, $1')[0]
```

Out[18]:
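
The `$n` notation plays the same role as `\n` in a regular expression: the referenced group must occur again. For comparison, this is how a back-reference behaves in the standard `re` module (plain regex, no Treepace involved).

```python
import re

# \1 requires the text captured by group 1 to appear a second time.
repeated = re.search(r'(\w+) \1', 'say hello hello there')
not_repeated = re.search(r'(\w+) \1', 'say hello world')

print(repeated.group(0))
print(not_repeated)
```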

More complicated relationships between the nodes in a match can be expressed using back-references in a predicate.

In [19]:

```
nums = Tree(Node(1, [Node(-1), Node(0.5)]))
match = nums.search('{[_ != 2]} < [abs(_) == $1]')
match[0].group(0)
```

Out[19]:

To assert that the match must begin exactly at the root node, use the `match()` method.

In [20]:

```
Tree.load('node (node (node))').match('node < node')
```

Out[20]:

If the match must cover all nodes of the tree, the `fullmatch()` method can be called. This is useful for validation.

In [21]:

```
fruits = Tree.load('fruits (apple pear apple)')
display(fruits)
if fruits.fullmatch('fruits < apple & pear'):
    print('The stock contains at least one apple and pear, but no other fruit.')
else:
    print('The condition is not met.')
```
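The names `search()`, `match()` and `fullmatch()` follow the functions of Python's `re` module, where they differ in the same way: anywhere, anchored at the start, and covering everything. The analogy is shown below with plain strings; the assumption is only that the tree versions are meant to be read the same way, as the descriptions above suggest.

```python
import re

pattern = r'ab'

anywhere = re.search(pattern, 'xabx')           # found inside the text
at_start_fail = re.match(pattern, 'xabx')       # text does not start with 'ab'
at_start_ok = re.match(pattern, 'abx')          # starts with 'ab', rest ignored
whole_fail = re.fullmatch(pattern, 'abx')       # trailing 'x' is not covered
whole_ok = re.fullmatch(pattern, 'ab')          # the entire text is covered

print(bool(anywhere), bool(at_start_fail), bool(at_start_ok),
      bool(whole_fail), bool(whole_ok))
```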

The `replace()` method substitutes all matches of the pattern with the given replacement. Although it is not necessary, we will first search for the pattern (for illustration):

In [22]:

```
shop = Tree.load('shop (item (bread) item (water) item (roll) item (water))')
pattern = '{item} < water'
display(shop.search(pattern))
```

The actual replacement is simple:

In [23]:

```
shop.replace(pattern, '$1 < juice')
display(shop)
```

A transformation, performed by the `transform()` method, consists of one or more rules in the form `pattern -> replacement`. Each rule is applied repeatedly for as long as it finds a match. In addition, the whole list of rules is repeated while at least one rule finds a match. To illustrate this behavior, the following transformation is performed:

In [24]:

```
subject = Tree.load('a (b)')
print('Original:')
display(subject)
subject.transform('''x -> y
a -> x''')
print('Transformed:')
display(subject)
```
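The order of rule application can be simulated without Treepace. The sketch below assumes that each rule is reapplied while it still matches and that the whole rule list is rerun while any rule matched; it supports only label renames, and `rename_once` and `apply_rules` are hypothetical helpers, not library functions.

```python
def rename_once(node, old, new):
    """Rename the first node labelled `old` (pre-order); report success."""
    if node[0] == old:
        node[0] = new
        return True
    return any(rename_once(child, old, new) for child in node[1])


def apply_rules(tree, rules, max_passes=100):
    """Rerun the rule list until a full pass finds no match."""
    for _ in range(max_passes):
        any_rule_matched = False
        for old, new in rules:
            if old == new:
                continue  # a no-op rule would otherwise match forever
            while rename_once(tree, old, new):
                any_rule_matched = True
        if not any_rule_matched:
            return tree
    raise RuntimeError('rules did not reach a fixed point')


# The tree 'a (b)' as mutable [label, children] lists.
sketch_tree = ['a', [['b', []]]]
apply_rules(sketch_tree, [('x', 'y'), ('a', 'x')])
print(sketch_tree)
```

In the first pass only `a -> x` matches; because a rule matched, the list is run again and `x -> y` now applies, so the root ends up labelled `y`.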

A more useful transformation follows. Here is a sample XML document:

In [25]:

```
text = '''<article>
<heading>An example</heading>
<content>
<calc>
<plus>
<elem>3</elem>
<elem>4</elem>
</plus>
</calc>
</content>
</article>'''
doc = Tree.load(text, XmlText)
doc
```

Out[25]:

We will replace the semantic document representation with its visual HTML form and evaluate the arithmetic expression it contains.

In [26]:

```
doc.transform('''
article -> html < body
heading -> h1
content -> p
calc < plus < elem<{.}>, elem<{.}> -> [text(num($1) + num($2))]
''')
display(doc)
print(doc.save(XmlText))
```

This concludes the tutorial. You can install the library by running

```
py -m pip install treepace
```

on Windows or

```
pip install treepace
```

on Linux.