Marshal as serialization of TF data

Marshal is a data serialization format from the Python standard library. It is more primitive than pickle, but it might be faster.
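To make "more primitive" concrete, here is a minimal sketch of the difference in supported types. The sample values are hypothetical stand-ins, not Text-Fabric data: both modules round-trip dictionaries of integers, strings and tuples, but marshal rejects many objects that pickle accepts, such as a `datetime.date`.

```python
import datetime
import marshal
import pickle

# Hypothetical sample: core types only (ints, strings, tuples).
core = {1: "BR>CJT", 2: (3, 4, 5)}
core_via_marshal = marshal.loads(marshal.dumps(core))
core_via_pickle = pickle.loads(pickle.dumps(core))
print(core_via_marshal == core, core_via_pickle == core)

# A non-core object: pickle can serialize it, marshal cannot.
stamp = datetime.date(2021, 12, 9)
restored = pickle.loads(pickle.dumps(stamp))

try:
    marshal.dumps(stamp)
    marshal_accepts_date = True
except ValueError:
    # marshal raises ValueError for unsupported types
    marshal_accepts_date = False

print(restored == stamp, marshal_accepts_date)
```

The feature data tested here consists only of integers, strings and tuples, so this restriction does not get in the way of the comparison.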

As a simple test, we take the feature data for g_word and oslots.

g_word maps the numbers 1 to ca. 420,000 to Hebrew word occurrences (ASCII strings).

oslots is a map from ca. 1 million integers to tuples of integers.

In Text-Fabric we have a representation in plain text and a compressed, pickled representation.

We also run the deserialization in two ways: with the garbage collector enabled, and with the garbage collector deliberately turned off.

Outcome

Pickle is faster. Loading gzipped, pickled data is much faster than loading gzipped, marshalled data: roughly 12 to 14 times for g_word and 3 to 5 times for oslots in the timings below.

The uncompressed marshal serialization is much bigger than the TF text representation.

The size of the gzipped marshal serialization is approximately the same as the gzipped, pickled TF serialization.
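The size claims can be checked along the following lines. This is a sketch on a hypothetical, much smaller stand-in for an oslots-like map, so the numbers it prints will differ from those of the real feature data, and the comparison with the TF text representation is not included.

```python
import gzip
import marshal
import pickle

# Hypothetical stand-in for an oslots-like map: integers to tuples of integers.
sample = {n: tuple(range(n, n + 3)) for n in range(1, 50_001)}

serializers = {
    "pickle": lambda obj: pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL),
    "marshal": marshal.dumps,
}

sizes = {}
for name, dumps in serializers.items():
    raw = dumps(sample)
    # compresslevel=2 mirrors GZIP_LEVEL used in this notebook
    packed = gzip.compress(raw, compresslevel=2)
    sizes[name] = (len(raw), len(packed))
    print(f"{name:8} raw={len(raw):>9} bytes  gzipped={len(packed):>9} bytes")
```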

Detailed comparison

Load times in seconds:

what                    g_word  oslots
pickle.gz with gc       0.08    0.7
pickle.gz without gc    0.09    0.38
marshal.gz with gc      1.11    1.86
marshal.gz without gc   1.07    1.85

Conclusion

We do not see reasons to replace the TF feature data serialization by marshal.

We do not fiddle with the garbage collector.

In [1]:
%load_ext autoreload
%autoreload 2
In [2]:
import os
import gzip
import marshal
import pickle
import gc
from shutil import move

from tf.fabric import Fabric
from tf.app import use

GZIP_LEVEL = 2 # same as used in Text-Fabric

Load from the textual data

In [3]:
BASE = os.path.expanduser("~/github/annotation/text-fabric")
TEST_BASE = f"{BASE}/_temp/serial"
TEST_DATA_TF = f"{TEST_BASE}/tf"
TEST_DATA_SERIAL = f"{TEST_BASE}/serialized"
FEATURES = ("g_word", "oslots")

os.makedirs(TEST_DATA_SERIAL, exist_ok=True)
In [4]:
TF = Fabric(locations=TEST_DATA_TF)
api = TF.load(FEATURES)
This is Text-Fabric 9.1.11
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html

2 features found and 0 ignored
  0.00s Not all of the warp features otype and oslots are present in
~/github/annotation/text-fabric/_temp/serial/tf/
  0.00s Only the Feature and Edge APIs will be enabled
  0.00s Warp feature "otext" not found. Working without Text-API

   |       23s T oslots               from ~/github/annotation/text-fabric/_temp/serial/tf
   |     2.29s T g_word               from ~/github/annotation/text-fabric/_temp/serial/tf
    25s All features loaded/computed - for details use TF.isLoaded()

During this time, the textual data has been compiled and written in binary form. We move the binary files (gzipped pickle) to the serialized directory.

In [5]:
for fName in FEATURES:
    move(f"{TEST_DATA_TF}/.tf/2/{fName}.tfx", f"{TEST_DATA_SERIAL}/{fName}.pickle.gz")

Load gz-pickled

In [6]:
def load(fName, ext, withGc=True):
    # Load one serialized feature file and log the elapsed time
    TF.indent(reset=True)
    fullName = f"{fName}.{ext}"
    path = f"{TEST_DATA_SERIAL}/{fullName}"
    if ext == "pickle.gz":
        loader = pickle.load
    elif ext == "marshal.gz":
        loader = marshal.load
    else:
        raise ValueError(f"unknown extension: {ext}")
    TF.info(f"start loading {fullName}")
    if not withGc:
        gc.disable()
    try:
        with gzip.open(path, "rb") as f:
            data = loader(f)
        TF.info(f"end loading {fullName}")
    finally:
        # switch the collector back on, even if loading fails
        if not withGc:
            gc.enable()
    return data
In [7]:
data = {}

for fName in FEATURES:
    data[fName] = load(fName, "pickle.gz")
    
for fName in FEATURES:
    data[fName] = load(fName, "pickle.gz", withGc=False)
  0.00s start loading g_word.pickle.gz
  0.09s end loading g_word.pickle.gz
  0.00s start loading oslots.pickle.gz
  0.70s end loading oslots.pickle.gz
  0.00s start loading g_word.pickle.gz
  0.09s end loading g_word.pickle.gz
  0.00s start loading oslots.pickle.gz
  0.43s end loading oslots.pickle.gz

Make a marshal feature data file

In [8]:
for fName in FEATURES:
    # gzip.open creates the file itself; no separate open() is needed
    with gzip.open(f"{TEST_DATA_SERIAL}/{fName}.marshal.gz", "wb", compresslevel=GZIP_LEVEL) as f:
        marshal.dump(data[fName], f)
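Before timing the new files, it can be worth checking that a gzipped marshal file restores the same data that went in. The sketch below does this on a hypothetical miniature of a g_word-like map, written to a temporary directory rather than to the test directory used here.

```python
import gzip
import marshal
import os
import tempfile

# Hypothetical miniature of a g_word-like map (node number -> ASCII string).
sample = {1: "B", 2: "R>CJT", 3: "BR>", 4: ">LHJM"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.marshal.gz")
    # write: marshal straight into a gzip stream
    with gzip.open(path, "wb", compresslevel=2) as f:
        marshal.dump(sample, f)
    # read: unmarshal straight from the gzip stream
    with gzip.open(path, "rb") as f:
        restored = marshal.load(f)

print(restored == sample)
```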

Load gz-marshal

In [9]:
dataMarshal = {}

for fName in FEATURES:
    dataMarshal[fName] = load(fName, "marshal.gz")

for fName in FEATURES:
    dataMarshal[fName] = load(fName, "marshal.gz", withGc=False)
  0.00s start loading g_word.marshal.gz
  1.16s end loading g_word.marshal.gz
  0.00s start loading oslots.marshal.gz
  1.92s end loading oslots.marshal.gz
  0.00s start loading g_word.marshal.gz
  1.07s end loading g_word.marshal.gz
  0.00s start loading oslots.marshal.gz
  1.87s end loading oslots.marshal.gz
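One factor these timings do not separate is the cost of reading through the gzip file object versus the cost of unmarshalling itself. A possible variant, sketched below with a hypothetical helper `load_marshal_gz_bytes`, decompresses the whole file into memory first and then calls `marshal.loads`. It was not timed on the TF feature data, so no conclusion is drawn from it.

```python
import gzip
import marshal
import os
import tempfile


def load_marshal_gz_bytes(path):
    """Decompress the whole file first, then unmarshal from memory."""
    with gzip.open(path, "rb") as f:
        return marshal.loads(f.read())


# Hypothetical small sample, written to a temporary file for the demo.
sample = {n: (n, n + 1) for n in range(1, 1001)}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.marshal.gz")
    with gzip.open(path, "wb", compresslevel=2) as f:
        marshal.dump(sample, f)
    restored = load_marshal_gz_bytes(path)

print(restored == sample)
```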

With garbage collector turned off or on?

It seems that oslots loads much faster with the garbage collector temporarily switched off.
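Switching the collector off temporarily can be wrapped in a context manager, so that it is restored even when loading fails. This is a minimal sketch with a hypothetical helper `gc_paused`; it is not part of Text-Fabric and says nothing about how `withGc` is implemented there.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic garbage collector inside the with-block."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # restore the previous state only; do not enable a collector
        # that was already off before the block
        if was_enabled:
            gc.enable()


before = gc.isenabled()
with gc_paused():
    inside = gc.isenabled()
    # allocate many small container objects, as deserialization does
    pairs = [(n, n + 1) for n in range(100_000)]
after = gc.isenabled()

print(before, inside, after)
```

Only the cyclic collector is paused; reference counting still frees objects as usual.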

Let's try to load the whole BHSA in both ways:

In [17]:
TF.indent(reset=True)
TF.info("start loading bhsa with gc switched off")
A = use("bhsa", withGc=False)
TF.info("end loading bhsa with gc switched off")
  0.00s start loading bhsa with gc switched off
TF-app: ~/text-fabric-data/annotation/app-bhsa/code
data: ~/text-fabric-data/etcbc/bhsa/tf/2021
data: ~/text-fabric-data/etcbc/phono/tf/2021
data: ~/text-fabric-data/etcbc/parallels/tf/2021
This is Text-Fabric 9.1.11
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html

122 features found and 0 ignored
Text-Fabric: Text-Fabric API 9.1.11, app-bhsa v3, Search Reference
Data: BHSA, Character table, Feature docs
Features: Parallel Passages; BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis; Phonetic Transcriptions
(expandable per-feature metadata for the 122 features: type, description, author, dataset, dateWritten, encoders, version, website)
  7.70s end loading bhsa with gc switched off
In [18]:
TF.indent(reset=True)
TF.info("start loading bhsa with gc switched on")
A = use("bhsa", withGc=True)
TF.info("end loading bhsa with gc switched on")
  0.00s start loading bhsa with gc switched on
TF-app: ~/text-fabric-data/annotation/app-bhsa/code
data: ~/text-fabric-data/etcbc/bhsa/tf/2021
data: ~/text-fabric-data/etcbc/phono/tf/2021
data: ~/text-fabric-data/etcbc/parallels/tf/2021
This is Text-Fabric 9.1.11
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html

122 features found and 0 ignored
Text-Fabric: Text-Fabric API 9.1.11, app-bhsa v3, Search Reference
Data: BHSA, Character table, Feature docs
Features:
Parallel Passages
int
๐Ÿ†— links between similar passages
author:
BHSA Data: Constantijn Sikkel; Parallels Notebook: Dirk Roorda, Martijn Naaijer
coreData:
BHSA
dateWritten:
2021-12-09T14:40:46Z
provenance:
Parallels notebook, see https://github.com/ETCBC/parallels
version:
2021
writtenBy:
Text-Fabric
BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
str
โœ… book name in Latin (Genesis; Numeri; Reges1; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:55Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ book name in amharic (ኣማርኛ)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:20:27Z
email:
shebanq@ancient-data.org
encoders:
Dirk Roorda (TF)
language:
ኣማርኛ
languageCode:
am
languageEnglish:
amharic
provenance:
book names from wikipedia and other sources
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ chapter number (1; 2; 3; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:55Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ identifier of a clause atom relationship (0; 74; 367; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:56Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
det
str
✅ determinedness of phrase(atom) (det; und; NA.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:56Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:57Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ frequency of lexemes
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:24:45Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
computed on the basis of the ETCBC core set of features
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:57Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:57Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:58Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:58Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:17:59Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:04Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:04Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
🆗 english translation of lexeme (beginning create god(s))
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:13Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
gn
str
✅ grammatical gender (m; f; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:05Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:06Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ of word or lexeme (Hebrew; Aramaic.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:13Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
lex
str
✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:14Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:15Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
ls
str
✅ lexical set, subclassification of part-of-speech (card; ques; mult)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:15Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
⚠️ named entity type (pers; mens; gens; topo; ppde.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:15Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
nme
str
✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:08Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
nu
str
✅ grammatical number (sg; du; pl; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:08Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ sequence number of an object within its context
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:09Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:15Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:22:50Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional paragraph file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
pdp
str
✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:10Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
pfm
str
✅ preformative consonantal-transliterated (absent; n/a; J, ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:11Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
prs
str
✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:11Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ pronominal suffix gender (m; f; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:11Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ pronominal suffix number (sg; du; pl; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:12Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ pronominal suffix person (p1; p2; p3; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:12Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
ps
str
✅ grammatical person (p1; p2; p3; NA; unknown.)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:12Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word pointed-transliterated masoretic reading correction
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:23:29Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional ketiv/qere file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ interword material -pointed-transliterated (Masoretic correction)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:23:29Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional ketiv/qere file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ interword material -pointed-transliterated (Masoretic correction)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:23:29Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional ketiv/qere file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ word pointed-Hebrew masoretic reading correction
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:23:29Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional ketiv/qere file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ ranking of lexemes based on frequency
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:24:46Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
computed on the basis of the ETCBC core set of features
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:13Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
sp
str
✅ part-of-speech (art; verb; subs; nmpr, ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:16Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
st
str
✅ state of a noun (a (absolute); c (construct); e (emphatic).)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:14Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
tab
int
✅ clause atom: its level in the linguistic embedding
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:16Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ interword material pointed-transliterated (& 00 05 00_P ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:01Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ interword material pointed-Hebrew (־ ׃)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:01Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
txt
str
✅ text type of clause and surrounding (repetition of ? N D Q as in feature domain)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:16Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
typ
str
✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:16Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
uvf
str
✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:17Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
vbe
str
✅ verbal ending consonantal-transliterated (n/a; W; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:17Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
vbs
str
✅ root formation consonantal-transliterated (absent; n/a; H; ...)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:17Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
int
✅ verse number
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:18Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:16Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
str
✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:17Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
provenance:
from additional lexicon file provided by the ETCBC
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
vs
str
✅ verbal stem (qal; piel; hif; apel; pael)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:18Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
vt
str
✅ verbal tense (perf; impv; wayq; infc)
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:18Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
none
✅ linguistic dependency between textual objects
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:18:22Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
none
author:
Eep Talstra Centre for Bible and Computer
dataset:
BHSA
datasetName:
Biblia Hebraica Stuttgartensia Amstelodamensis
dateWritten:
2021-12-09T14:21:17Z
email:
shebanq@ancient-data.org
encoders:
Constantijn Sikkel (QDF), Ulrik Petersen (MQL) and Dirk Roorda (TF)
version:
2021
website:
https://shebanq.ancient-data.org
writtenBy:
Text-Fabric
Phonetic Transcriptions
str
🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)
author:
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
coreData:
BHSA
dateWritten:
2021-12-09T14:25:55Z
provenance:
computed by the phono notebook, see https://github.com/ETCBC/phono
version:
2021
writtenBy:
Text-Fabric
str
🆗 interword material in phonological transcription
author:
BHSA Data: Constantijn Sikkel; Phono Notebook: Dirk Roorda
coreData:
BHSA
dateWritten:
2021-12-09T14:25:55Z
provenance:
computed by the phono notebook, see https://github.com/ETCBC/phono
version:
2021
writtenBy:
Text-Fabric
  6.34s end loading bhsa with gc switched on

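Switching the garbage collector off does not make much difference: loading the BHSA took 7.70s with gc switched off and 6.34s with gc switched on. We leave the garbage collector untouched by default, i.e. we do not switch it off.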
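
The with-gc versus without-gc comparison can be reproduced in miniature with the standard library alone. The following is a minimal sketch, not Text-Fabric's own loading code: `timed_load` and the `sample` dictionary are hypothetical names, with `sample` standing in for a feature such as oslots (integers mapped to tuples of integers), and the compression level 2 mirrors the GZIP_LEVEL set at the top of this notebook.

```python
import gc
import gzip
import pickle
import time


def timed_load(payload, with_gc):
    """Unpickle a gzipped payload; return (data, elapsed seconds).

    Hypothetical helper: the collector is switched on or off for the
    duration of the load and restored to its previous state afterwards.
    """
    was_enabled = gc.isenabled()
    if with_gc:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        data = pickle.loads(gzip.decompress(payload))
        elapsed = time.perf_counter() - start
    finally:
        # put the collector back the way we found it
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return data, elapsed


# made-up stand-in data: integers mapped to tuples of integers
sample = {n: (n, n + 1, n + 2) for n in range(1, 200_001)}
payload = gzip.compress(
    pickle.dumps(sample, protocol=pickle.HIGHEST_PROTOCOL),
    compresslevel=2,
)

for with_gc in (True, False):
    data, elapsed = timed_load(payload, with_gc)
    label = "with gc" if with_gc else "without gc"
    print(f"{label:<10} {elapsed:.2f}s for {len(data)} entries")
```

Timings from a toy dictionary like this will not match the figures reported in this notebook; the point is only the measurement pattern, in particular restoring the collector in a `finally` clause so that a failed load cannot leave it switched off.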
