#!/usr/bin/env python
# coding: utf-8

# # Start
#
# This notebook gets you started with using
# [Text-Fabric](https://github.com/annotation/text-fabric) for coding in the
# letters of René Descartes.
#
# Familiarity with the underlying
# [data model](https://annotation.github.io/text-fabric/tf/about/datamodel.html)
# is recommended.
#
# For provenance, see the documentation:
# [about](https://github.com/CLARIAH/descartes-tf/blob/master/docs/about.md).

# ## Overview
#
# * we tell you how to get Text-Fabric on your system;
# * we tell you how to get the Descartes corpus on your system.

# ## Installing Text-Fabric
#
# See the [installation instructions](https://annotation.github.io/text-fabric/tf/about/install.html).

# # Running Text-Fabric
#
# We will run computer code in the cells below, and this code makes use of the
# text-fabric library, called `tf` for short.
#
# We import some standard Python modules and then we import the `use` function from text-fabric.

# In[1]:


import sys, os
from tf.app import use


# Now we are going to *use* the `use` function.
# We want to *use* a corpus, and if we specify which corpus, text-fabric will load the data for us.
#
# If you have cloned the `CLARIAH/descartes-tf` repository to your local machine under the directory
#
# `~/github/CLARIAH/descartes-tf`
#
# then you already have the data.
# In that case, call the `use` function like this:
#
# ```
# A = use("CLARIAH/descartes-tf:clone", checkout="clone", hoist=globals())
# ```
#
# Below we give the command for the case where you have not cloned the repository.
# Text-Fabric will fetch the data from the internet and store it in your directory
#
# `~/text-fabric-data/github/CLARIAH/descartes-tf`.
#
# In both cases, the corpus data will be optimised for fast processing, a one-time job.

# In[2]:


A = use("CLARIAH/descartes-tf", hoist=globals())


# # The following loads will be much quicker!
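#
# One way to check the speed-up yourself is to time each load.
# The sketch below assumes a hypothetical helper `timed` (it is not part of Text-Fabric)
# that wraps any zero-argument callable and reports how long the call took.

```python
import time


def timed(loader):
    """Call loader() and return (result, elapsed seconds).

    `timed` is an illustrative helper, not a Text-Fabric function.
    """
    start = time.perf_counter()
    result = loader()
    return result, time.perf_counter() - start


# Hypothetical usage in this notebook (needs text-fabric and the corpus):
#   A, seconds = timed(lambda: use("CLARIAH/descartes-tf", hoist=globals()))
#   print(f"loaded in {seconds:.1f} s")
```

# Wrapping the first and the second load in `timed` should show the difference,
# although the actual numbers depend on your machine and network.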
#
# Just to show the results of the optimization step: if we give the same command again,
# the data loads much more quickly.

# In[3]:


A = use("CLARIAH/descartes-tf", hoist=globals())


# ### The output
#
# The messages after loading the corpus contain a lot of information about it.
#
# **Tip:** click the triangles and the links, and have a quick look.
#
# The **Text-Fabric** line has various links to the API docs.
#
# Under **Node types** you find statistics about the corpus.
#
# Under **Descartes = Descartes, all letters** you find the *features* of the corpus
# with short descriptions.
#
# This corpus has additional material: *illustrations*.
# They have been downloaded automatically in the process, and you see how many there are.

# # Highlights
#
# This corpus is special in that it has mathematical formulas and illustrations.
#
# We show some of them to whet your appetite.
#
# ## Formulas
#
# There are simple formulas and complex formulas.
# The latter are represented as TeX codes, and will be typeset nicely.
#
# Let's find the complex ones.

# In[4]:


query = """
formula notation=TeX
"""
results = A.search(query)


# Let's show a few.

# In[5]:


A.table(results, end=3)


# You can see them in context as well:

# In[6]:


A.show(results, end=3)


# ---
#
# # Next steps
#
# By now you have an impression of how to orient yourself in this corpus.
# The next steps will show you how to do more with it: searching and computing.
#
# After that, it is time to collect results, use them in new annotations, and share them.
#
# * **start** intro and highlights
# * **[search](search.ipynb)** turbocharge your hand-coding with search templates
# * **[compute](compute.ipynb)** sink down a level and compute it yourself
# * **[exportExcel](exportExcel.ipynb)** make tailor-made spreadsheets out of your results
#
# Advanced
#
# * **[similar sentences](similar.ipynb)** find similar sentences
#
# CC-BY Dirk Roorda