#!/usr/bin/env python
# coding: utf-8

# # Some system statistics (Nestle1904GBI)

# ## Table of contents
# * 1 - Introduction
# * 2 - Load Text-Fabric app and data
# * 3 - Performing the queries
#     * 3.1 - Print the Text-Fabric version
#     * 3.2 - Dump selection of header
#     * 3.3 - Memory footprint
#     * 3.4 - List loaded features
#     * 3.5 - Statistics on node types
#     * 3.6 - Node number ranges
# * 4 - Required libraries

# # 1 - Introduction
# ##### [Back to TOC](#TOC)
#
# This Jupyter Notebook showcases several examples of statistical analysis performed on a Text-Fabric corpus.

# # 2 - Load Text-Fabric app and data
# ##### [Back to TOC](#TOC)

# In[1]:


get_ipython().run_line_magic('load_ext', 'autoreload')
get_ipython().run_line_magic('autoreload', '2')


# In[2]:


# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use


# In[3]:


# Load the N1904 app and data
N1904 = use("tonyjurg/Nestle1904GBI", version="0.4", hoist=globals())


# In[4]:


# Push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())


# # 3 - Performing the queries
# ##### [Back to TOC](#TOC)

# ## 3.1 - Print the Text-Fabric version
# ##### [Back to TOC](#TOC)
#
# Although this is somewhat trivial, this example does serve a purpose. We will print the version by reading the Text-Fabric parameter [VERSION](https://annotation.github.io/text-fabric/tf/parameters.html#tf.parameters.VERSION), which is fixed for the whole program. To access any of these parameters in our notebook, it must first be imported from `tf.parameters`.

# In[5]:


from tf.parameters import VERSION
print('Text-Fabric version: {}'.format(VERSION))


# Note that any other parameter can be dumped in a similar manner.

# In[6]:


N1904.showContext(...)

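# The remark in section 3.1 that other parameters can be dumped in a similar manner could be sketched as follows. This is a minimal sketch and not part of the original notebook: the helper `dump_constants` is a hypothetical name, and the sketch assumes that the other values in `tf.parameters` follow the same upper-case naming as `VERSION`.

```python
def dump_constants(module):
    """Return {name: value} for the public, upper-case, non-callable names of a module."""
    return {
        name: value
        for name, value in vars(module).items()
        if name.isupper() and not name.startswith("_") and not callable(value)
    }


try:
    # Assumption: Text-Fabric is installed and importable as `tf`
    import tf.parameters as tf_parameters
except ImportError:
    tf_parameters = None

if tf_parameters is not None:
    # Print every upper-case constant found in the module, sorted by name
    for name, value in sorted(dump_constants(tf_parameters).items()):
        print('{}: {!r}'.format(name, value))
else:
    print('Text-Fabric is not installed; nothing to dump.')
```

# The helper only inspects the module namespace, so it works for any module object and does not depend on a loaded corpus.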
# ## 3.2 - Dump selection of header
# ##### [Back to TOC](#TOC)
#
# In this example the header of the loaded Text-Fabric dataset is dumped. This is done by means of an API call to [`A.header()`](https://annotation.github.io/text-fabric/tf/advanced/links.html#tf.advanced.links.header).
#
# Please note that in the example below `A` is replaced by `N1904`. This is a result of the way the app was loaded:
# > N1904 = use (... *etc* ... )
#
# The [`use`](https://annotation.github.io/text-fabric/tf/app.html#tf.app.use) function returns an object whose attributes and methods constitute the advanced API.

# In[7]:


N1904.header(allMeta=False)


# ## 3.3 - Memory footprint
# ##### [Back to TOC](#TOC)
#
# The API call [`footprint`](https://annotation.github.io/text-fabric/tf/core/api.html#tf.core.api.Api.footprint) gives a formatted overview of the memory footprint of each feature in the Text-Fabric corpus.

# In[8]:


TF.footprint()


# ## 3.4 - List loaded features
# ##### [Back to TOC](#TOC)
#
# The API call [`A.isLoaded()`](https://annotation.github.io/text-fabric/tf/core/api.html#tf.core.api.Api.isLoaded) shows information about the loaded features.

# In[9]:


N1904.isLoaded()


# ## 3.5 - Statistics on node types
# ##### [Back to TOC](#TOC)
#
# This example shows various statistics on node types. The call to [`C.levels.data`](https://annotation.github.io/text-fabric/tf/core/prepare.html#tf.core.prepare.levels) returns a list of ordered tuples, which is displayed as a table using the `tabulate` function.

# In[10]:


# Library to format table
from tabulate import tabulate

headers = ["Node", "Average # of slots", "first", "last"]
ResultList = C.levels.data
print(tabulate(ResultList, headers=headers, tablefmt='fancy_grid'))


# ## 3.6 - Node number ranges
# ##### [Back to TOC](#TOC)

# In[11]:


# Print the first and last node number for each node type
for NodeType in F.otype.all:
    print(NodeType, F.otype.sInterval(NodeType))


# Note that the ranges in the output of this command are the same (except possibly for their order) as those found in the file `otype.tf`:
#
# ```
# @node
# @TextFabric version=11.4.10
# ...
# @valueType=str
# @writtenBy=Text-Fabric
# @dateWritten=2023-06-19T16:21:20Z
#
# 1-137779 word
# 137780-137806 book
# 137807-138066 chapter
# 138067-154190 clause
# 154191-226864 phrase
# 226865-232584 sentence
# 232585-240527 verse
# ```

# # 4 - Required libraries
# ##### [Back to TOC](#TOC)
#
# The scripts in this notebook require (besides `text-fabric`) the following Python libraries to be installed in the environment:
#
#     tabulate
#
# You can install any missing library from within Jupyter Notebook using either `pip` or `pip3`.
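# Whether the required libraries are present can be checked before running the notebook. The following is a minimal sketch using only the standard library; the function name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical names introduced for illustration. Note that the `text-fabric` package is imported under the module name `tf`.

```python
from importlib.util import find_spec

# Import names of the libraries used in this notebook.
# The `text-fabric` package is imported as `tf`.
REQUIRED_MODULES = ["tf", "tabulate"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

# Any module reported as missing can then be installed with `pip` or `pip3` as described above.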