This Jupyter Notebook provides detailed instructions on how to load the CenterBLC/N1904 Text-Fabric dataset into your Python environment. This will enable you to perform linguistic analysis on the Greek New Testament (Nestle 1904, 7th edition).
The CenterBLC/N1904 Text-Fabric dataset is available as a collection of files hosted on GitHub. The files in this dataset fall into two main types:
The feature data files are stored in the directory tf, where each subdirectory maps to a specific version. Each version is accompanied by release information that can be viewed here.
The application-related files are an integral part of the Text-Fabric dataset and provide dataset-specific functionality such as viewtypes.
When you invoke the latest version of the Text-Fabric dataset, the code downloads a single zip file instead of individual files. This file, complete.zip, contains all the necessary files (and some bookkeeping files) for a specific release.
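As a rough illustration of where the downloaded files end up: Text-Fabric normally caches GitHub-hosted datasets under `~/text-fabric-data/github/<org>/<repo>`. The sketch below assumes that default location and a `tf` subdirectory with one folder per version. The helper name `cached_versions` is hypothetical. It only reads the file system and does not call Text-Fabric.

```python
from pathlib import Path


def cached_versions(org="CenterBLC", repo="N1904", base=None):
    """Return the version directories found in the local Text-Fabric cache.

    Assumption: the cache lives in ~/text-fabric-data/github/<org>/<repo>/tf,
    with one subdirectory per dataset version.
    """
    base = Path(base) if base is not None else Path.home() / "text-fabric-data"
    tf_dir = base / "github" / org / repo / "tf"
    if not tf_dir.is_dir():
        return []  # nothing downloaded yet, or a non-default cache location
    return sorted(p.name for p in tf_dir.iterdir() if p.is_dir())


print(cached_versions())
```

An empty list simply means that no version has been downloaded to the default location yet.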
If you want to load a specific version (other than the latest one), you may need to increase GitHub's rate limit. Instructions on how to do this can be found in this Jupyter Notebook.
Before you can start using Text-Fabric, you need to set up a suitable Python environment (Python 3.7.0 or later). An example of installing a Python environment using Anaconda is demonstrated in this Jupyter Notebook. You also need to install the Text-Fabric package in this environment; instructions are provided in this Jupyter Notebook. This setup only needs to be done once. Afterward, the Text-Fabric code will be available for loading into your system's memory.
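A quick way to confirm that the interpreter behind the current kernel meets the stated minimum is sketched below. It uses only the standard library; the helper name `python_is_recent_enough` is hypothetical and the threshold is simply the 3.7.0 mentioned above.

```python
import sys

MIN_PYTHON = (3, 7, 0)  # minimum version stated in this notebook


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:3]) >= minimum


# Show the running version and whether it is sufficient.
print(sys.version.split()[0], python_is_recent_enough())
```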
Besides keeping your Python environment updated, it is also advisable to periodically update your installed version of Text-Fabric to the latest or a more recent release. How to do this from within a Jupyter Notebook is demonstrated in this Notebook.
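Before updating, it can help to see which release is currently installed. The following sketch queries the package metadata through the standard library (this needs Python 3.8 or later). It assumes the distribution is installed under its PyPI name `text-fabric`; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="text-fabric"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


tf_version = installed_version()
print("text-fabric:", tf_version if tf_version else "not installed")
```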
In certain situations (particularly when loading Text-Fabric datasets other than the latest version), it may also be necessary to increase the rate limit for GitHub. See this Notebook for more information.
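As a small aid, the sketch below checks whether a GitHub personal access token is visible to the current session. It assumes the token is supplied through an environment variable named `GHPERS`, which is how the Text-Fabric documentation describes it; verify the variable name against the linked Notebook before relying on it. The helper name `github_token_configured` is hypothetical.

```python
import os


def github_token_configured(environ=os.environ, name="GHPERS"):
    """Return True when a non-empty token variable is present.

    Assumption: Text-Fabric reads the token from the variable GHPERS.
    """
    return bool(environ.get(name, "").strip())


if github_token_configured():
    print("A GitHub token is set; the higher rate limit should apply.")
else:
    print("No GitHub token found; the default rate limit applies.")
```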
The instructions in this section need to be executed each time you want to use Text-Fabric. They will first load the Text-Fabric code and then load the data into memory.
%load_ext autoreload
%autoreload 2
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use
The following invocation of the function use() loads all features of the corpus. It creates a data structure (in this example N1904) with associated methods and functions. Collectively this is referred to as the 'Advanced API'; the 'cheat sheet' refers to it as A.*something*. The exact name, however, is determined at invocation: it is the variable to which the result of use() is assigned. Hence, in this notebook, references to this 'Advanced API' should be addressed as N1904.
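To make the naming point concrete, here is a toy sketch that does not use Text-Fabric at all. `fake_use` and `FakeApp` are hypothetical stand-ins. They only mimic two ideas: the returned object is named by whatever variable you assign it to, and a dictionary passed as `hoist` (in a notebook, `globals()`) lets the function place extra short names such as `F`, `L` and `T` in that namespace.

```python
class FakeApp:
    """Hypothetical stand-in for the object returned by use()."""

    def __init__(self, corpus):
        self.corpus = corpus


def fake_use(corpus, hoist=None):
    """Mimic the calling pattern of use(); this is NOT the Text-Fabric API."""
    app = FakeApp(corpus)
    if hoist is not None:
        # Place short handles in the caller-supplied namespace.
        for name in ("F", "L", "T"):
            hoist[name] = f"<{name} API of {corpus}>"
    return app


namespace = {}  # a plain dict stands in for globals()
N1904 = fake_use("CenterBLC/N1904", hoist=namespace)
NT = fake_use("CenterBLC/N1904")  # same kind of object under another name

print(N1904.corpus, sorted(namespace))
```

Whether the handle is called `N1904`, `NT` or `A` is purely a matter of the assignment; the cheat sheet's A.*something* then becomes N1904.*something* in this notebook.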
# load the N1904-TF app and data
N1904 = use("CenterBLC/N1904", version="1.0.0", hoist=globals())
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
book | 27 | 5102.93 | 100 |
chapter | 260 | 529.92 | 100 |
verse | 7944 | 17.34 | 100 |
sentence | 8011 | 17.20 | 100 |
group | 8945 | 7.01 | 46 |
clause | 42506 | 8.36 | 258 |
wg | 106868 | 6.88 | 533 |
phrase | 69007 | 1.90 | 95 |
subphrase | 116178 | 1.60 | 135 |
word | 137779 | 1.00 | 100 |
Display is setup for viewtype syntax-view
See here for more information on viewtypes
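The node-type table in the output above can be sanity-checked with a little arithmetic. The sketch below rests on my reading of the columns, which is an assumption: "# slots / node" is the average number of words a node spans, and "% coverage" is the number of nodes times that average, divided by the total number of words (137779). Under that reading, values above 100 indicate node types whose nodes overlap or nest. The helper name `estimated_coverage` is hypothetical.

```python
TOTAL_SLOTS = 137779  # number of word nodes (slots) reported in the table

# (node type, number of nodes, slots per node, reported % coverage)
ROWS = [
    ("book", 27, 5102.93, 100),
    ("chapter", 260, 529.92, 100),
    ("verse", 7944, 17.34, 100),
    ("sentence", 8011, 17.20, 100),
    ("group", 8945, 7.01, 46),
    ("clause", 42506, 8.36, 258),
    ("wg", 106868, 6.88, 533),
    ("phrase", 69007, 1.90, 95),
    ("subphrase", 116178, 1.60, 135),
    ("word", 137779, 1.00, 100),
]


def estimated_coverage(nodes, slots_per_node, total=TOTAL_SLOTS):
    """Estimate % coverage as nodes * average span / total slots."""
    return round(100 * nodes * slots_per_node / total)


for name, nodes, slots_per_node, reported in ROWS:
    estimate = estimated_coverage(nodes, slots_per_node)
    print(f"{name:10s} reported={reported:4d} estimated={estimate:4d}")
```

Because the table rounds the average span to two decimals, an estimate can differ from the reported percentage by a point.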
The following code is optional. Its main purpose is to ensure that Text-Fabric objects, such as tables and syntax trees, are displayed in the online Notebook Viewer the same way as in the Jupyter Notebook itself. It uses the getCss(app) function, called below as N1904.getCss(), to collect the complete CSS code from Text-Fabric and the app.
# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())
Note: this is achieved by embedding the CSS code inside the notebook file. The content of the CSS code can be examined in this cell's output (truncated):
{ "cell_type": "code", "execution_count": 7, "id": "932992c9-3fd9-4b5a-aa22-48eb376c8622", "metadata": {}, "outputs": [ { "data": { "text/html": [ "<style>tr.tf.ltr, td.tf.ltr, th.tf.ltr { text-align: left ! important;}\n", "tr.tf.rtl, td.tf.rtl, th.tf.rtl { text-align: right ! important;}\n", "@font-face {\n", " font-family: \"Gentium Plus\";\n", ... etc ...
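To check whether a notebook file actually carries the embedded stylesheet, one option is to scan its JSON for code cells whose HTML output contains a `<style>` tag. The sketch below does this on a tiny hand-made notebook dictionary (hypothetical content, shaped like the excerpt above); the helper name `cells_with_embedded_css` is also hypothetical.

```python
def cells_with_embedded_css(notebook):
    """Return indices of code cells whose HTML output contains a <style> tag."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            html = output.get("data", {}).get("text/html", [])
            # Notebook JSON stores multi-line text as a list of lines or a string.
            text = "".join(html) if isinstance(html, list) else html
            if "<style>" in text:
                hits.append(index)
                break
    return hits


# Minimal stand-in for a notebook file (hypothetical content).
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["Some explanation"]},
        {
            "cell_type": "code",
            "outputs": [
                {"data": {"text/html": ["<style>tr.tf.ltr { text-align: left; }\n", "</style>"]}}
            ],
        },
        {"cell_type": "code", "outputs": []},
    ]
}

print(cells_with_embedded_css(sample))
```

For a real file, load it first with `json.load` from the standard library and pass the resulting dictionary to the function.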