This notebook gets you started with using Text-Fabric for coding in the Hebrew Bible.
If you are totally new to Text-Fabric, it might be helpful to read about the underlying data model first.
In a notebook, you can perform searches, view the results in a tabular display, and zoom in on items with pretty displays.
But there are times that you want to take your results outside Text-Fabric, outside a notebook, outside Python, and just work with them in other programs, such as Excel.
You want to do that not only with query results, but with all kinds of lists of tuples of nodes.
There is a function for that, A.export(), and here we show what it can do.
%load_ext autoreload
%autoreload 2
The ins and outs of installing Text-Fabric, getting the corpus, and initializing a notebook are explained in the start tutorial.
import os
from tf.app import use
A = use("ETCBC/bhsa", hoist=globals())
Locating corpus resources ...
Name | # of nodes | # slots/node | % coverage |
---|---|---|---|
book | 39 | 10938.21 | 100 |
chapter | 929 | 459.19 | 100 |
lex | 9230 | 46.22 | 100 |
verse | 23213 | 18.38 | 100 |
half_verse | 45179 | 9.44 | 100 |
sentence | 63717 | 6.70 | 100 |
sentence_atom | 64514 | 6.61 | 100 |
clause | 88131 | 4.84 | 100 |
clause_atom | 90704 | 4.70 | 100 |
phrase | 253203 | 1.68 | 100 |
phrase_atom | 267532 | 1.59 | 100 |
subphrase | 113850 | 1.42 | 38 |
word | 426590 | 1.00 | 100 |
We write a function that can peek into a file on your system and show its first few lines. We'll use it to inspect the exported files that we are going to produce.
EXPORT_FILE = os.path.expanduser("~/Downloads/results.tsv")
UPTO = 10
def checkout():
    with open(EXPORT_FILE, encoding="utf_16") as fh:
        for (i, line) in enumerate(fh):
            if i >= UPTO:
                break
            print(line)
Our exported .tsv files open in Excel without hassle, even if they contain non-latin characters.
That is because TF writes such files in an encoding that works well with Excel: utf_16_le.
You can just open them in Excel, there is no need for conversion before or after opening these files.
Should you want to process these files by means of a (Python) program, take care to read them with encoding utf_16.
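As a minimal sketch of such processing, the snippet below reads a UTF-16, tab-separated file with Python's standard csv module. It does not use Text-Fabric: the file it reads is a small, hypothetical stand-in that the snippet writes itself, and the helper name read_export is an illustration, not part of TF. It also assumes that fields in the file are not quoted.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for an exported file: a tiny tab-separated table
# written in UTF-16, so that the example runs without Text-Fabric.
sample_rows = [
    ["R", "NODE1", "TYPE1", "TEXT1"],
    ["1", "141550", "word", "אֶפְרָ֑יִם"],
]

path = os.path.join(tempfile.mkdtemp(), "results.tsv")
with open(path, "w", encoding="utf_16", newline="") as fh:
    csv.writer(fh, delimiter="\t").writerows(sample_rows)


def read_export(path):
    """Read a UTF-16, tab-separated file into a list of dicts keyed by header."""
    with open(path, encoding="utf_16", newline="") as fh:
        # Assumption: fields are plain text without quoting
        return list(csv.DictReader(fh, delimiter="\t", quoting=csv.QUOTE_NONE))


rows = read_export(path)
print(rows[0]["NODE1"], rows[0]["TEXT1"])
```

The utf_16 codec uses the byte order mark at the start of the file to detect the byte order, so the reader does not need to specify it.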
We first run a query in order to export the results.
query = """
book book=Samuel_I
  clause
    word sp=nmpr
"""
results = A.search(query)
0.38s 1868 results
You can export the table of results to Excel.
The following command writes a tab-separated file results.tsv to your downloads directory.
You can specify arguments toDir=directory and toFile=file name to write to a different file.
If the directory does not exist, it will be created.
We stick to the default, however.
A.export(results)
Check out the contents:
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | book1 | NODE2 | TYPE2 | TEXT2 | NODE3 | TYPE3 | TEXT3 | sp3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453958 | clause | וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם | 141550 | word | אֶפְרָ֑יִם | nmpr |
2 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141553 | word | אֶ֠לְקָנָה | nmpr |
3 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141555 | word | יְרֹחָ֧ם | nmpr |
4 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141557 | word | אֱלִיה֛וּא | nmpr |
5 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141559 | word | תֹּ֥חוּ | nmpr |
6 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141561 | word | צ֖וּף | nmpr |
7 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453961 | clause | שֵׁ֤ם אַחַת֙ חַנָּ֔ה | 141569 | word | חַנָּ֔ה | nmpr |
8 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453962 | clause | וְשֵׁ֥ם הַשֵּׁנִ֖ית פְּנִנָּ֑ה | 141574 | word | פְּנִנָּ֑ה | nmpr |
9 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453964 | clause | לִפְנִנָּה֙ יְלָדִ֔ים | 141578 | word | פְנִנָּה֙ | nmpr |
You see the following columns:
R: the sequence number of the result tuple in the result list
S1 S2 S3: the section as book, chapter, verse, in separate columns
NODEi TYPEi: the node and its type, for each node i in the result tuple
TEXTi: the full text of node i, if the node type admits a concise text representation
sp3: the value of feature sp for node 3, since our query mentions the feature sp on node 3
If we want to see the clause type (feature typ) and the word gender (feature gn) as well, we must mention them in the query.
We can do so as follows:
query = """
book book=Samuel_I
  clause typ*
    word sp=nmpr gn*
"""
results = A.search(query)
0.67s 1868 results
The same number of results as before.
The * is a trivial condition: it is always true.
We do the export again and peek at the results.
A.export(results)
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | book1 | NODE2 | TYPE2 | TEXT2 | typ2 | NODE3 | TYPE3 | TEXT3 | gn3 | sp3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453958 | clause | וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם | WayX | 141550 | word | אֶפְרָ֑יִם | unknown | nmpr |
2 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | 141553 | word | אֶ֠לְקָנָה | m | nmpr |
3 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | 141555 | word | יְרֹחָ֧ם | m | nmpr |
4 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | 141557 | word | אֱלִיה֛וּא | m | nmpr |
5 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | 141559 | word | תֹּ֥חוּ | m | nmpr |
6 | 1_Samuel | 1 | 1 | 426598 | book | Samuel_I | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | 141561 | word | צ֖וּף | m | nmpr |
7 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453961 | clause | שֵׁ֤ם אַחַת֙ חַנָּ֔ה | NmCl | 141569 | word | חַנָּ֔ה | f | nmpr |
8 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453962 | clause | וְשֵׁ֥ם הַשֵּׁנִ֖ית פְּנִנָּ֑ה | NmCl | 141574 | word | פְּנִנָּ֑ה | f | nmpr |
9 | 1_Samuel | 1 | 2 | 426598 | book | Samuel_I | 453964 | clause | לִפְנִנָּה֙ יְלָדִ֔ים | NmCl | 141578 | word | פְנִנָּה֙ | f | nmpr |
As you see, you have two extra columns: typ2 and gn3.
This gives you a lot of control over the generation of spreadsheets.
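Once such feature columns are in the file, they are easy to process outside Text-Fabric as well. The sketch below counts combinations of clause type and gender. To stay self-contained it reads a few values copied from the export above from an in-memory string rather than from the real file; a real run would open the exported file with encoding utf_16 instead.

```python
import csv
import io
from collections import Counter

# A few rows and columns copied from the export shown above;
# the real file has more of both.
sample = (
    "R\ttyp2\tgn3\n"
    "1\tWayX\tunknown\n"
    "2\tNmCl\tm\n"
    "7\tNmCl\tf\n"
)

rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))

# Count how often each (clause type, gender) combination occurs
counts = Counter((row["typ2"], row["gn3"]) for row in rows)
for (typ, gn), n in sorted(counts.items()):
    print(typ, gn, n)
```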
You can also export lists of node tuples that are not obtained by a query:
tuples = (
    tuple(results[0][1:3]),
    tuple(results[1][1:3]),
)
tuples
((453958, 141550), (453959, 141553))
Two rows, each row has a clause node and a word node.
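The same slicing works for any number of rows. As a small sketch, the snippet below uses the first two result tuples of the query above as literal node numbers, so it runs without Text-Fabric, and drops the book node from each row with a generator expression.

```python
# The first two result tuples of the query above: (book, clause, word) nodes
results = [
    (426598, 453958, 141550),
    (426598, 453959, 141553),
]

# Keep only the clause and word node (indices 1 and 2) of every row
tuples = tuple(row[1:3] for row in results)
print(tuples)
```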
Let's do a bare export:
A.export(tuples)
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | TEXT1 | book1 | NODE2 | TYPE2 | TEXT2 | typ2 |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 453958 | clause | וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם |  | 141550 | word | אֶפְרָ֑יִם |  |
2 | 1_Samuel | 1 | 1 | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ |  | 141553 | word | אֶ֠לְקָנָה |  |
Wait a minute: why is the typ2 there?
It is because we have run a query before where we asked for typ.
If we do not want to be influenced by previous things we've run, we need to reset the display:
A.displayReset("tupleFeatures")
Again:
A.export(tuples)
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | TEXT1 | NODE2 | TYPE2 | TEXT2 |
---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 453958 | clause | וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם | 141550 | word | אֶפְרָ֑יִם |
2 | 1_Samuel | 1 | 1 | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | 141553 | word | אֶ֠לְקָנָה |
We can get richer exports by means of A.displaySetup(), using the parameter tupleFeatures:
A.displaySetup(
    tupleFeatures=(
        (0, "typ rela"),
        (1, "sp gn nu pdp"),
    )
)
We assign extra features per member of the tuple.
In the above case:
the first (0) member (the clause node) gets features typ and rela;
the second (1) member (the word node) gets features sp, gn, nu, and pdp.
A.export(tuples)
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | TEXT1 | typ1 | rela1 | NODE2 | TYPE2 | TEXT2 | sp2 | gn2 | nu2 | pdp2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 453958 | clause | וַיְהִי֩ אִ֨ישׁ אֶחָ֜ד מִן־הָרָמָתַ֛יִם צֹופִ֖ים מֵהַ֣ר אֶפְרָ֑יִם | WayX | NA | 141550 | word | אֶפְרָ֑יִם | nmpr | unknown | sg | nmpr |
2 | 1_Samuel | 1 | 1 | 453959 | clause | וּשְׁמֹ֡ו אֶ֠לְקָנָה בֶּן־יְרֹחָ֧ם בֶּן־אֱלִיה֛וּא בֶּן־תֹּ֥חוּ בֶן־צ֖וּף אֶפְרָתִֽי׃ | NmCl | NA | 141553 | word | אֶ֠לְקָנָה | nmpr | m | sg | nmpr |
Talking about display setup: other parameters also have effect, e.g. the text format.
Let's change it to the phonetic representation.
A.export(tuples, fmt="text-phono-full")
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | TEXT1 | typ1 | rela1 | NODE2 | TYPE2 | TEXT2 | sp2 | gn2 | nu2 | pdp2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 453958 | clause | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim | WayX | NA | 141550 | word | ʔefrˈāyim | nmpr | unknown | sg | nmpr |
2 | 1_Samuel | 1 | 1 | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141553 | word | ʔelqānˌā | nmpr | m | sg | nmpr |
You can chain queries like this:
results = (
    A.search(
        """
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=nmpr
"""
    )
    + A.search(
        """
book book=Samuel_I
  chapter chapter=1
    verse verse=1
      clause
        word sp=verb nu=pl
"""
    )
)
0.39s 6 results
0.40s 1 result
In such cases, it is better to set up the features yourself:
A.displaySetup(
    tupleFeatures=(
        (3, "typ rela"),
        (4, "sp gn vt vs"),
    ),
    fmt="text-phono-full",
)
Now we can do a fine export:
A.export(results)
checkout()
R | S1 | S2 | S3 | NODE1 | TYPE1 | NODE2 | TYPE2 | NODE3 | TYPE3 | TEXT3 | NODE4 | TYPE4 | TEXT4 | typ4 | rela4 | NODE5 | TYPE5 | TEXT5 | sp5 | gn5 | vt5 | vs5 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453958 | clause | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim | WayX | NA | 141550 | word | ʔefrˈāyim | nmpr | unknown | NA | NA |
2 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141553 | word | ʔelqānˌā | nmpr | m | NA | NA |
3 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141555 | word | yᵊrōḥˈām | nmpr | m | NA | NA |
4 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141557 | word | ʔᵉlîhˈû | nmpr | m | NA | NA |
5 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141559 | word | tˌōḥû | nmpr | m | NA | NA |
6 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453959 | clause | ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | NmCl | NA | 141561 | word | ṣˌûf | nmpr | m | NA | NA |
7 | 1_Samuel | 1 | 1 | 426598 | book | 426862 | chapter | 1421518 | verse | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim ûšᵊmˈô ʔelqānˌā ben-yᵊrōḥˈām ben-ʔᵉlîhˈû ben-tˌōḥû ven-ṣˌûf ʔefrāṯˈî . | 453958 | clause | wayᵊhˌî ʔˌîš ʔeḥˈāḏ min-hārāmāṯˈayim ṣôfˌîm mēhˈar ʔefrˈāyim | WayX | NA | 141547 | word | ṣôfˌîm | verb | m | ptca | qal |
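Chaining with + is plain list concatenation, so a tuple that matches more than one query would be exported more than once. The two queries above cannot overlap, but in other cases you may want to merge the lists without duplicates first. The helper below is a hypothetical sketch, not part of Text-Fabric; it assumes that search results are lists of node tuples, as the use of + above suggests, and it runs on literal node numbers taken from the export above.

```python
def merge_results(*result_lists):
    """Concatenate result lists, dropping repeated tuples but keeping order."""
    seen = set()
    merged = []
    for results in result_lists:
        for row in results:
            if row not in seen:
                seen.add(row)
                merged.append(row)
    return merged


# Node numbers taken from the export above;
# the second list repeats a row of the first on purpose
first = [(426598, 426862, 1421518, 453958, 141550)]
second = [
    (426598, 426862, 1421518, 453958, 141547),
    (426598, 426862, 1421518, 453958, 141550),
]

merged = merge_results(first, second)
print(len(merged))
```

The merged list can then be passed to A.export() like any other list of node tuples.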
Now you know how to escape from Text-Fabric.
We hope that this makes your stay in TF more comfortable. It's not a Hotel California.
CC-BY Dirk Roorda