Lexical parallels in parasha #3: Lech Lecha (Genesis 12:1-17:27)¶

Table of Contents (ToC)¶

1 - Introduction
2 - Load Text-Fabric app and data
3 - Performing the queries
- 3.1 - Locate the parallels
4 - Required libraries
5 - Further reading
6 - Notebook version details

1 - Introduction ¶

Back to ToC ¶

In this notebook we search for lexical parallels between the verses of this parasha and other verses in the Tenach.

2 - Load Text-Fabric app and data ¶

Back to ToC ¶

The following code loads the Text-Fabric version of the Biblia Hebraica Stuttgartensia (Amstelodamensis), together with the additional parasha features from tonyjurg/BHSaddons.

In [1]:

%load_ext autoreload
%autoreload 2

In [2]:

# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment.
from tf.fabric import Fabric
from tf.app import use

In [3]:

# Load the BHSA app and data, plus the parasha features from the BHSaddons module
BHSA = use("etcbc/BHSA", mod="tonyjurg/BHSaddons/tf/", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/etcbc/BHSA/app

data: ~/text-fabric-data/github/etcbc/BHSA/tf/2021

data: ~/text-fabric-data/github/tonyjurg/BHSaddons/tf/2021

data: ~/text-fabric-data/github/etcbc/phono/tf/2021

data: ~/text-fabric-data/github/etcbc/parallels/tf/2021

TF: TF API 12.6.1, etcbc/BHSA/app v3, Search Reference
Data: etcbc - BHSA 2021, Character table, Feature docs

Node types

Name	# of nodes	# slots / node	% coverage
book	39	10938.21	100
chapter	929	459.19	100
lex	9230	46.22	100
verse	23213	18.38	100
half_verse	45179	9.44	100
sentence	63717	6.70	100
sentence_atom	64514	6.61	100
clause	88131	4.84	100
clause_atom	90704	4.70	100
phrase	253203	1.68	100
phrase_atom	267532	1.59	100
subphrase	113850	1.42	38
word	426590	1.00	100

Sets: no custom sets
Features:

Parallel Passages

crossref

int

🆗 links between similar passages

BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis

book

str

✅ book name in Latin (Genesis; Numeri; Reges1; ...)

book@ll

str

✅ book name in amharic (ኣማርኛ)

chapter

int

✅ chapter number (1; 2; 3; ...)

code

int

✅ identifier of a clause atom relationship (0; 74; 367; ...)

det

str

✅ determinedness of phrase(atom) (det; und; NA.)

domain

str

✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)

freq_lex

int

✅ frequency of lexemes

function

str

✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)

g_cons

str

✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)

g_cons_utf8

str

✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)

g_lex

str

✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)

g_lex_utf8

str

✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)

g_word

str

✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)

g_word_utf8

str

✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)

gloss

str

🆗 english translation of lexeme (beginning create god(s))

gn

str

✅ grammatical gender (m; f; NA; unknown.)

label

str

✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)

language

str

✅ of word or lexeme (Hebrew; Aramaic.)

lex

str

✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)

lex_utf8

str

✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)

ls

str

✅ lexical set, subclassification of part-of-speech (card; ques; mult)

nametype

str

⚠️ named entity type (pers; mens; gens; topo; ppde.)

nme

str

✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)

nu

str

✅ grammatical number (sg; du; pl; NA; unknown.)

number

int

✅ sequence number of an object within its context

otype

str

pargr

str

🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)

pdp

str

✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)

pfm

str

✅ preformative consonantal-transliterated (absent; n/a; J, ...)

prs

str

✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)

prs_gn

str

✅ pronominal suffix gender (m; f; NA; unknown.)

prs_nu

str

✅ pronominal suffix number (sg; du; pl; NA; unknown.)

prs_ps

str

✅ pronominal suffix person (p1; p2; p3; NA; unknown.)

ps

str

✅ grammatical person (p1; p2; p3; NA; unknown.)

qere

str

✅ word pointed-transliterated masoretic reading correction

qere_trailer

str

✅ interword material -pointed-transliterated (Masoretic correction)

qere_trailer_utf8

str

✅ interword material -pointed-transliterated (Masoretic correction)

qere_utf8

str

✅ word pointed-Hebrew masoretic reading correction

rank_lex

int

✅ ranking of lexemes based on freqnuecy

rela

str

✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)

sp

str

✅ part-of-speech (art; verb; subs; nmpr, ...)

st

str

✅ state of a noun (a (absolute); c (construct); e (emphatic).)

tab

int

✅ clause atom: its level in the linguistic embedding

trailer

str

✅ interword material pointed-transliterated (& 00 05 00_P ...)

trailer_utf8

str

✅ interword material pointed-Hebrew (־ ׃)

txt

str

✅ text type of clause and surrounding (repetion of ? N D Q as in feature domain)

typ

str

✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)

uvf

str

✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)

vbe

str

✅ verbal ending consonantal-transliterated (n/a; W; ...)

vbs

str

✅ root formation consonantal-transliterated (absent; n/a; H; ...)

verse

int

✅ verse number

voc_lex

str

✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)

voc_lex_utf8

str

✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)

vs

str

✅ verbal stem (qal; piel; hif; apel; pael)

vt

str

✅ verbal tense (perf; impv; wayq; infc)

mother

none

✅ linguistic dependency between textual objects

oslots

none

Phonetic Transcriptions

phono

str

🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)

phono_trailer

str

🆗 interword material in phonological transcription

tonyjurg/BHSaddons/tf

aliyotnum

str

The sequence number of the aliyot within the parasha

maftir

str

Set to 1 if this verse is part of a maftir

parashahebr

str

The name of the parasha in Hebrew

parashanum

int

The sequence number of the parasha

parashatrans

str

Transliteration of the Hebrew parasha name

parashaverse

str

The sequence number of the verse within the parasha

wordboundary

str

indicates wordboudaries (spaces OR maqaf)

Settings:

specified

apiVersion: 3
appName: etcbc/BHSA
appPath: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/app
commit: gd905e3fb6e80d0fa537600337614adc2af157309
css: ''
dataDisplay:
- exampleSectionHtml:
  <code>Genesis 1:1</code> (use <a href="https://github.com/{org}/{repo}/blob/master/tf/{version}/book%40en.tf" target="_blank">English book names</a>)
- excludedFeatures:
  - g_uvf_utf8
  - g_vbs
  - kq_hybrid
  - languageISO
  - g_nme
  - lex0
  - is_root
  - g_vbs_utf8
  - g_uvf
  - dist
  - root
  - suffix_person
  - g_vbe
  - dist_unit
  - suffix_number
  - distributional_parent
  - kq_hybrid_utf8
  - crossrefSET
  - instruction
  - g_prs
  - lexeme_count
  - rank_occ
  - g_pfm_utf8
  - freq_occ
  - crossrefLCS
  - functional_parent
  - g_pfm
  - g_nme_utf8
  - g_vbe_utf8
  - kind
  - g_prs_utf8
  - suffix_gender
  - mother_object_type
- noneValues:
  - none
  - unknown
  - no value
  - NA
docs:
- docBase: {docRoot}/{repo}
- docExt: ''
- docPage: ''
- docRoot: https://{org}.github.io
- featurePage: 0_home
interfaceDefaults: {}
isCompatible: True
local: local
localDir: C:/Users/tonyj/text-fabric-data/github/etcbc/BHSA/_temp
provenanceSpec:
- corpus: BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis
- doi: 10.5281/zenodo.1007624
- moduleSpecs:
  - :
    backend: no value
    corpus: Phonetic Transcriptions
    docUrl:
    https://nbviewer.jupyter.org/github/etcbc/phono/blob/master/programs/phono.ipynb
    doi: 10.5281/zenodo.1007636
    org: etcbc
    relative: /tf
    repo: phono
  - :
    backend: no value
    corpus: Parallel Passages
    docUrl:
    https://nbviewer.jupyter.org/github/etcbc/parallels/blob/master/programs/parallels.ipynb
    doi: 10.5281/zenodo.1007642
    org: etcbc
    relative: /tf
    repo: parallels
- org: etcbc
- relative: /tf
- repo: BHSA
- version: 2021
- webBase: https://shebanq.ancient-data.org/hebrew
- webHint: Show this on SHEBANQ
- webLang: la
- webLexId: True
- webUrl:
  {webBase}/text?book=<1>&chapter=<2>&verse=<3>&version={version}&mr=m&qw=q&tp=txt_p&tr=hb&wget=v&qget=v&nget=vt
- webUrlLex: {webBase}/word?version={version}&id=<lid>
release: v1.8
typeDisplay:
- clause:
  - label: {typ} {rela}
  - style: ''
- clause_atom:
  - hidden: True
  - label: {code}
  - level: 1
  - style: ''
- half_verse:
  - hidden: True
  - label: {label}
  - style: ''
  - verselike: True
- lex:
  - featuresBare: gloss
  - label: {voc_lex_utf8}
  - lexOcc: word
  - style: orig
  - template: {voc_lex_utf8}
- phrase:
  - label: {typ} {function}
  - style: ''
- phrase_atom:
  - hidden: True
  - label: {typ} {rela}
  - level: 1
  - style: ''
- sentence:
  - label: {number}
  - style: ''
- sentence_atom:
  - hidden: True
  - label: {number}
  - level: 1
  - style: ''
- subphrase:
  - hidden: True
  - label: {number}
  - style: ''
- word:
  - features: pdp vs vt
  - featuresBare: lex:gloss
writing: hbo

TF API: names N F E L T S C TF Fs Fall Es Eall Cs Call directly usable

3 - Performing the queries ¶

Back to ToC ¶

The main engine of our queries is the Text-Fabric edge feature crossref, which is part of the Parallel Passages module and links each verse to similar verses together with a similarity percentage. See also this notebook explaining the concepts and how this feature was created.
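As a rough illustration of how these edges can be consumed: the cells in this section treat the result of `Es("crossref").f(verseNode)` as a sequence of `(targetNode, confidence)` pairs. The sketch below does not need a live Text-Fabric session. It uses invented node numbers as a stand-in for that result, and a hypothetical helper `filterCrossRefs` (not part of Text-Fabric) that keeps only the pairs at or above a chosen similarity percentage.

```python
# Hypothetical stand-in for the result of Es("crossref").f(verseNode):
# pairs of (target verse node, similarity percentage). Node numbers are invented.
sampleCrossRefs = ((1414001, 76), (1414002, 83), (1414003, 91))

def filterCrossRefs(crossRefs, minimumMatch=80):
    """Return the (node, confidence) pairs with confidence >= minimumMatch, strongest first."""
    kept = [(node, confidence) for node, confidence in crossRefs if confidence >= minimumMatch]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

strongParallels = filterCrossRefs(sampleCrossRefs, minimumMatch=80)
print(strongParallels)
```

In a loaded session the same helper could presumably be applied to the real edge values before building a report, which would shorten the tables for verses with many weak parallels.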

3.1 - Locate the parallels ¶

In [4]:

# find all verse nodes for this parasha using its sequence number
parashaQuery = '''
verse parashanum=3
'''
parashaResults = BHSA.search(parashaQuery)

  0.01s 126 results

In [5]:

# Store the parasha name and its start and end verses for later use
startNode=parashaResults[0][0]
endNode=parashaResults[-1][0]
parashaNameHebrew=F.parashahebr.v(startNode)
parashaNameEnglish=F.parashatrans.v(startNode)
bookStart,chapterStart,startVerse=T.sectionFromNode(startNode)
parashaStart=f'{bookStart} {chapterStart}:{startVerse}'
bookEnd,chapterEnd,endVerse=T.sectionFromNode(endNode)
parashaEnd=f'{chapterEnd}:{endVerse}'

def wrapHTML(body, title):
    output = (
        f'<html><head><title>{title}</title></head>'
        f'<body>{body}<p>Data generated by `lexical_parallels.ipynb` at ' 
        '`<a href="https://github.com/tonyjurg/Parashot" target="_blank">'
        'github.com/tonyjurg/Parashot</a>`</p></body></html>'
    )
    return output

In [6]:

from difflib import SequenceMatcher
from IPython.display import HTML, display

# Function to find and highlight matching parts between two strings
def highlightMatches(baseText, comparisonText):
    matcher = SequenceMatcher(None, baseText, comparisonText)
    highlightedComparisonText = ""
    
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":  # Identical parts
            highlightedComparisonText += f"<mark>{comparisonText[j1:j2]}</mark>"
        else:  # Non-matching parts
            highlightedComparisonText += comparisonText[j1:j2]
    
    return highlightedComparisonText

# Function to process cross-references and format them into an HTML table
def generateCrossReferencesTable(verseNode):
    """
    Generates an HTML table with cross-references for a single verse node, highlighting identical parts.
    The main verse text will be right-aligned.
    """
    # Get cross-references for the specified verseNode
    crossRefs = Es("crossref").f(verseNode)
    tableContent = ""
    
    # Check if there are any cross-references for this verse
    if crossRefs:
        verseSection = T.sectionFromNode(verseNode)
        mainVerseText = T.text(verseNode)
        linkStepBible = (
            f"<a href=\"https://www.stepbible.org/?q=version=NASB2020%7Creference={verseSection[0]}.{verseSection[1]}:{verseSection[2]}&options=HNVUG\" target=\"_blank\">"
            f"{verseSection[0]} {verseSection[1]}:{verseSection[2]}</a>"
        ) 
        # Right-align the main verse text
        tableContent += f"<h4>Cross-references for {linkStepBible}</h4>"
        tableContent += f"<div style='text-align: right; font-weight: bold;'>{mainVerseText}</div>"
        # Create table header
        tableContent += f"<table border='1' style='border-collapse: collapse; width: 100%;'><tr><th>Reference</th><th>Match</th><th>Text</th></tr>"
        
        # Process each cross-reference and add a row to the table
        for target, confidence in crossRefs:
            targetSection = T.sectionFromNode(target)
            targetText = T.text(target)
            
            targetStepBible = (
                f"<a href=\"https://www.stepbible.org/?q=version=NASB2020%7Creference={targetSection[0]}.{targetSection[1]}:{targetSection[2]}&options=HNVUG\" target=\"_blank\">"
                f"{targetSection[0]} {targetSection[1]}:{targetSection[2]}</a>"
            )     
            # Highlight identical parts in target verse
            highlightedText = highlightMatches(mainVerseText, targetText)
            
            # Add the row for the cross-reference
            tableContent += f"<tr><td>{targetStepBible}</td><td>{confidence}%</td><td>{highlightedText}</td></tr>"
        
        # Close the table
        tableContent += "</table><br>"

    return tableContent

# Initialize HTML content
reportTitle=f'Lexical parallels for parasha {parashaNameEnglish} ({parashaStart}-{parashaEnd})'
htmlContent = f"<h2>{reportTitle}</h2>"

# Process each verse and generate cross-reference tables
for verse in parashaResults:
    htmlContent += generateCrossReferencesTable(verse[0])

# Display the HTML content in the notebook
display(HTML(htmlContent))

# Define the HTML filename and store to file
fileName = f"lexical_parallels({parashaNameEnglish.replace(' ','_')}).html"
htmlContentFull = wrapHTML(htmlContent,reportTitle)
with open(fileName, "w", encoding="utf-8") as file:
    file.write(htmlContentFull)
    
# Display a download button; the HTML is percent-encoded so that characters
# such as '#' and '%' cannot break the data URI
from urllib.parse import quote

downloadButton = f"""
<a download="{fileName}" href="data:text/html;charset=utf-8,{quote(htmlContentFull)}" target="_blank">
    <button>Download as HTML</button>
</a>
"""
display(HTML(downloadButton))

Lexical parallels for parasha Lech Lecha (Genesis 12:1-17:27)

Cross-references for Genesis 15:20

וְאֶת־הַחִתִּ֥י וְאֶת־הַפְּרִזִּ֖י וְאֶת־הָרְפָאִֽים׃

Reference	Match	Text
Genesis 10:17	76%	וְאֶת־הַֽחִוִּ֥י וְאֶת־הַֽעַרְקִ֖י וְאֶת־הַסִּינִֽי׃
1_Chronicles 1:15	76%	וְאֶת־הַחִוִּ֥י וְאֶת־הַֽעַרְקִ֖י וְאֶת־הַסִּינִֽי׃

Cross-references for Genesis 15:21

וְאֶת־הָֽאֱמֹרִי֙ וְאֶת־הַֽכְּנַעֲנִ֔י וְאֶת־הַגִּרְגָּשִׁ֖י וְאֶת־הַיְבוּסִֽי׃ ס

Reference	Match	Text
Genesis 10:16	83%	וְאֶת־הַיְבוּסִי֙ וְאֶת־הָ֣אֱמֹרִ֔י וְאֵ֖ת הַגִּרְגָּשִֽׁי׃
1_Chronicles 1:14	83%	וְאֶת־הַיְבוּסִי֙ וְאֶת־הָ֣אֱמֹרִ֔י וְאֵ֖ת הַגִּרְגָּשִֽׁי׃
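The highlighting in these tables relies on `difflib.SequenceMatcher`, which compares the two verse texts character by character. The following sketch uses two short Latin-script strings (chosen for readability, not taken from the corpus) to show which stretches `get_opcodes()` reports as equal; those are the stretches that `highlightMatches` wraps in `<mark>`.

```python
from difflib import SequenceMatcher

baseText = "and the Hittite"
comparisonText = "and the Hivite"

matcher = SequenceMatcher(None, baseText, comparisonText)

# Collect only the stretches of comparisonText that also occur in baseText
equalParts = [
    comparisonText[j1:j2]
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag == "equal"
]
print(equalParts)
```

Because the comparison works on individual characters, vowel points and accents in the Hebrew text count as characters too, so two verses that differ only in accentuation will show small unhighlighted gaps.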

4 - Required libraries ¶

Back to ToC ¶

The scripts in this notebook require, besides Text-Fabric, the following Python libraries:

difflib (part of the Python standard library, so no separate installation is needed)
IPython

You can install any missing library from within Jupyter Notebook using either pip or pip3.
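One way to check what is available before running the notebook is sketched below. The helper name `findMissingModules` is hypothetical; `tf` is the import name of Text-Fabric as used in the code cells of this notebook.

```python
from importlib.util import find_spec

def findMissingModules(moduleNames):
    """Return the names from moduleNames that cannot be found in this environment."""
    return [name for name in moduleNames if find_spec(name) is None]

# 'tf' is the import name of Text-Fabric as used in this notebook
missing = findMissingModules(["tf", "difflib", "IPython"])
print("Missing modules:", missing or "none")
```

If `tf` is reported as missing, running `%pip install text-fabric` in a notebook cell should install it into the environment of the running kernel.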

5 - Further reading ¶

Back to ToC ¶

For an elaborate treatment of parallel passages, see:

Willem Th. van Peursen and Eep Talstra. "Computer-Assisted Analysis of Parallel Texts in the Bible - The Case of 2 Kings xviii-xix and its Parallels in Isaiah and Chronicles" in Vetus Testamentum 57, pp. 45-72. 2007, Brill, Leiden.

6 - Notebook version details ¶

Back to ToC ¶

Author	Tony Jurg
Version	1.1
Date	18 November 2024
