Load the Text-Fabric dataset (N1904-TF)¶

Table of contents (ToC)¶

1 - Introduction
2 - Load Text-Fabric into memory
3 - Notebook version details

1 - Introduction ¶

Back to ToC ¶

This Jupyter Notebook provides detailed instructions on how to load the CenterBLC/N1904 Text-Fabric dataset into your Python environment. This will enable you to perform linguistic analysis on the Greek New Testament (Nestle 1904, 7th edition).

1.1 - Text-Fabric data versions ¶

The CenterBLC/N1904 Text-Fabric dataset is available as a collection of files hosted on GitHub. The files in this dataset fall into two main types:

The feature data files are stored in the directory tf, where each subdirectory corresponds to a specific version. Each version is accompanied by release information that can be viewed here.
The application-related files are an integral part of the Text-Fabric dataset and provide dataset-specific functionality such as viewtypes.

When you request the latest version of the dataset, Text-Fabric downloads a single zip file instead of individual files. This file, complete.zip, contains all the necessary files (and some bookkeeping files) for a specific release.

If you want to load a specific version other than the latest one, you may need to increase GitHub's rate limit. Instructions on how to do this can be found in this Jupyter Notebook.

1.2 - Prerequisites / Installation ¶

Before you can start using Text-Fabric, you need to set up a suitable Python environment (at least Python version 3.7.0). An example of installing a Python environment using Anaconda is demonstrated in this Jupyter Notebook. You also need to install the Text-Fabric package in this environment; instructions are provided in this Jupyter Notebook. This setup only needs to be done once. Afterward, the Text-Fabric code will be available for loading into your system's memory.
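The prerequisites above can be checked from within Python before going any further. The following is a minimal sketch, not part of Text-Fabric itself: the helper name check_environment is hypothetical, it assumes the package was installed under the distribution name "text-fabric", and it relies on importlib.metadata, which requires Python 3.8 or later.

```python
import sys
from importlib import metadata, util

MIN_PYTHON = (3, 7, 0)  # minimum Python version stated in this notebook


def check_environment():
    """Report the Python version and whether Text-Fabric appears to be installed."""
    current = tuple(sys.version_info[:3])
    report = {
        "python_version": current,
        "python_ok": current >= MIN_PYTHON,
        # The Text-Fabric code is imported as the package `tf`
        "tf_importable": util.find_spec("tf") is not None,
        "tf_version": None,
    }
    try:
        # Assumption: the distribution name is "text-fabric"
        report["tf_version"] = metadata.version("text-fabric")
    except metadata.PackageNotFoundError:
        pass  # not installed under that name; tf_version stays None
    return report


print(check_environment())
```

If tf_importable is False, the import statements in section 2.1 will fail and the installation step needs to be completed first.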

Besides keeping your Python environment updated, it is also advisable to periodically update your installed version of Text-Fabric to the latest or a more recent release. How to do this from within a Jupyter Notebook is demonstrated in this Notebook.

In certain situations (particularly when loading Text-Fabric datasets other than the latest version), it may also be necessary to increase the rate limit for GitHub. See this Notebook for more information.

1.3 - Updates ¶

The following Jupyter Notebook discusses the various aspects of updating your Text-Fabric version to other releases.

2 - Load Text-Fabric into memory ¶

Back to ToC ¶

The instructions in this section need to be executed each time you want to use Text-Fabric. They will first load the Text-Fabric code and then load the data into memory.

2.1 - Load the Text-Fabric code ¶

In [2]:

%load_ext autoreload
%autoreload 2

In [4]:

# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use

2.2 - Load Text-Fabric app and data ¶

The following invocation of the function use() loads all features of the corpus. It creates a data structure (in this example N1904) with associated methods and functions. Collectively this is referred to as the 'Advanced API'; the 'cheat sheet' refers to it as A.*something*. The exact name is, however, determined by the variable to which the result of use() is assigned. Hence, in this notebook, the 'Advanced API' should be addressed as N1904.

In [8]:

# load the N1904-TF app and data
N1904 = use("CenterBLC/N1904", version="1.0.0", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/CenterBLC/N1904/app

data: ~/text-fabric-data/github/CenterBLC/N1904/tf/1.0.0

TF: TF API 12.5.3, CenterBLC/N1904/app v3, Search Reference
Data: CenterBLC - N1904 1.0.0, Character table, Feature docs

Node types

Name	# of nodes	# slots / node	% coverage
book	27	5102.93	100
chapter	260	529.92	100
verse	7944	17.34	100
sentence	8011	17.20	100
group	8945	7.01	46
clause	42506	8.36	258
wg	106868	6.88	533
phrase	69007	1.90	95
subphrase	116178	1.60	135
word	137779	1.00	100
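A note on the "% coverage" column: values above 100 (clause, wg, subphrase) suggest that nodes of these types can be nested, so the same word slot is counted more than once. The reported figures appear consistent with (number of nodes × average slots per node) ÷ total number of words. The sketch below checks this using only the rows of the table above; the formula is an inference from these numbers and is not taken from the Text-Fabric source, and because the averages are rounded to two decimals the estimates can differ slightly from the reported values.

```python
# Rows copied from the "Node types" table:
# (name, number of nodes, average slots per node, reported % coverage)
NODE_TYPES = [
    ("book", 27, 5102.93, 100),
    ("chapter", 260, 529.92, 100),
    ("verse", 7944, 17.34, 100),
    ("sentence", 8011, 17.20, 100),
    ("group", 8945, 7.01, 46),
    ("clause", 42506, 8.36, 258),
    ("wg", 106868, 6.88, 533),
    ("phrase", 69007, 1.90, 95),
    ("subphrase", 116178, 1.60, 135),
    ("word", 137779, 1.00, 100),
]

TOTAL_SLOTS = 137779  # number of word nodes; words are the slots of this corpus


def estimated_coverage(n_nodes, slots_per_node, total_slots=TOTAL_SLOTS):
    """Estimate % coverage as nodes x average slots per node / total slots."""
    return 100 * n_nodes * slots_per_node / total_slots


for name, n_nodes, avg_slots, reported in NODE_TYPES:
    estimate = estimated_coverage(n_nodes, avg_slots)
    print(f"{name:10s} estimated {estimate:6.1f}   reported {reported}")
```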

Sets: no custom sets
Features:

Nestle 1904 Greek New Testament

after

str

material after the end of the word

appositioncontainer

int

1 if it is an apposition container

articular

int

1 if the sentence, group, clause, phrase or wg has an article

before

str

this is XML attribute before

book

str

book name (full name)

bookshort

str

book name (abbreviated) from ref attribute in xml

case

str

grammatical case

chapter

int

chapter number, from ref attribute in xml

clausetype

str

clause type

cls

str

this is XML attribute cls

cltype

str

clause type

criticalsign

str

this is XML attribute criticalsign

crule

str

clause rule (from xml attribute Rule)

degree

str

grammatical degree

discontinuous

int

1 if the word is out of sequence in the xml

domain

str

domain

framespec

str

this is XML attribute framespec

function

str

this is XML attribute function

gender

str

grammatical gender

gloss

str

English gloss (BGVB)

id

str

xml id

junction

str

type of junction

lang

str

language the text is in

lemma

str

lexical lemma

lemmatranslit

str

transliteration of the word lemma

ln

str

ln

mood

str

verbal mood

morph

str

morphological code

nodeid

str

node id (as in the XML source data)

normalized

str

lemma normalized

note

str

annotation of linguistic nature

num

int

generated number (not in xml): book: (Matthew=1, Mark=2, ..., Revelation=27); sentence: numbered per chapter; word: numbered per verse.

number

str

grammatical number

otype

str

person

str

grammatical person

punctuation

str

punctuation found after a word

ref

str

biblical reference with word counting

referent

str

number of referent

rela

str

this is XML attribute rela

role

str

role

rule

str

syntactical rule

sp

str

part-of-speech

strong

int

strong number

subjrefspec

str

this is XML attribute subjrefspec

tense

str

verbal tense

text

str

the text of a word

trailer

str

material after the end of the word (excluding critical signs)

trans

str

translation of the word surface text according to the Berean Interlinear Bible

translit

str

transliteration of the word surface text

typ

str

syntactical type (on sentence, group, clause or phrase)

typems

str

morphological type (on word), syntactical type (on sentence, group, clause, phrase or wg)

unaccent

str

word in unicode characters without accents and diacritical markers

unicode

str

word in unicode characters plus material after it

variant

str

this is XML attribute variant

verse

int

verse number, from ref attribute in xml

voice

str

verbal voice

frame

str

frame

oslots

none

parent

none

parent relationship between words

sibling

int

this is XML attribute sibling

subjref

none

number of subject referent

Settings:

specified

apiVersion: 3
appName: CenterBLC/N1904
appPath: C:/Users/tonyj/text-fabric-data/github/CenterBLC/N1904/app
commit: gdb630837ae89b9468c9e50d13bda05cfd3de4f18
css: ''
dataDisplay:
- excludedFeatures: []
- noneValues:
  - none
  - unknown
  - no value
  - NA
- sectionSep1:
- sectionSep2: :
- textFormat: text-orig-full
docs:
- docBase: https://github.com/CenterBLC/N1904/tree/main/docs
- docPage: about
- docRoot: https://github.com/CenterBLC/N1904
- featureBase:
  https://github.com/CenterBLC/N1904/blob/main/docs/features/<feature>.md
- featurePage: README
interfaceDefaults: {fmt: text-orig-full}
isCompatible: True
local: local
localDir:
C:/Users/tonyj/text-fabric-data/github/CenterBLC/N1904/_temp
provenanceSpec:
- branch: main
- corpus: Nestle 1904 Greek New Testament
- doi: 10.5281/zenodo.13117910
- moduleSpecs: []
- org: CenterBLC
- relative: /tf
- repo: N1904
- repro: N1904
- version: 1.0.0
- webBase: https://learner.bible/text/show_text/nestle1904/
- webHint: Show this on the website
- webLang: en
- webUrl:
  https://learner.bible/text/show_text/nestle1904/<1>/<2>/<3>
- webUrlLex: {webBase}/word?version={version}&id=<lid>
release: 1.0.0
typeDisplay:
- clause:
  - condense: True
  - label: {typ} {function} {rela} \\ {cls} {role} {junction}
  - style: ''
- group:
  - label: {typ} {function} {rela} \\ {typems} {role} {rule}
  - style: ''
- phrase:
  - condense: True
  - label: {typ} {function} {rela} \\ {typems} {role} {rule}
  - style: ''
- sentence:
  - label: {typ} {function} {rela} \\ {role} {rule}
  - style: ''
- subphrase:
  - label: {typ} {function} {rela} \\ {typems} {role} {rule}
  - style: ''
- verse:
  - condense: True
  - label: {book} {chapter}:{verse}
  - style: ''
- wg:
  - condense: True
  - label: {typems} {role} {rule} {junction}
  - style: ''
- word:
  - features:
    lemma
    sp
  - featuresBare: [gloss]
writing: grc

TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

Display is setup for viewtype syntax-view

See here for more information on viewtypes
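Because use() was called with hoist=globals(), the names listed in the "TF API" line of the output (N, F, E, L, T, S, ...) can be used directly in later cells. As a small illustration, the sketch below counts nodes per type through F.otype.s(), which returns the nodes of a given type. The helper count_nodes is hypothetical and only wraps that call; the final lines run only when F has actually been hoisted into the notebook's namespace.

```python
def count_nodes(F, node_types):
    """Count the nodes of each requested type using F.otype.s(node_type)."""
    return {ntype: len(list(F.otype.s(ntype))) for ntype in node_types}


# Runs only in a session where use(..., hoist=globals()) has defined F
if "F" in globals():
    print(count_nodes(F, ["book", "chapter", "verse", "word"]))
```

The counts can be compared with the "# of nodes" column of the node types table printed when the data was loaded.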

2.3 - Push CSS code to the Notebook ¶

The following code is optional. Its main purpose is to ensure that the formatting of Text-Fabric objects, such as tables and syntax trees, is displayed properly in the online Notebook Viewer, matching the way it is shown in the Jupyter Notebook itself. It uses the getCss() method (called here as N1904.getCss()) to collect the complete CSS code from Text-Fabric and the app.

In [10]:

# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())

Note: this is achieved by embedding the CSS code inside the notebook file. The content of the CSS code can be examined in this cell's output (truncated):

{
   "cell_type": "code",
   "execution_count": 7,
   "id": "932992c9-3fd9-4b5a-aa22-48eb376c8622",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style>tr.tf.ltr, td.tf.ltr, th.tf.ltr { text-align: left ! important;}\n",
       "tr.tf.rtl, td.tf.rtl, th.tf.rtl { text-align: right ! important;}\n",
       "@font-face {\n",
       "  font-family: \"Gentium Plus\";\n",

       ... etc ...

3 - Notebook version details ¶

Back to ToC ¶

Author	Tony Jurg
Version	1.1
Date	9 October 2024
