A hapax legomenon (ἅπαξ λεγόμενον) is the term used in linguistics and philology for a word or expression that appears only once within a specific context. Usually, this context is the complete works of an author or a well-defined corpus of literature. The term comes from Greek, where "hapax" means "once" and "legomenon" means "something said." In this Notebook, the context used to determine the hapax legomena is the full text of the Tenach, or more precisely, the full Biblia Hebraica Stuttgartensia.
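As a minimal illustration of the definition, the sketch below counts token frequencies in a small made-up English corpus and keeps the tokens that occur exactly once. The function name hapax_legomena and the sample sentence are assumptions for illustration only; they are not part of the BHSA data or of Text-Fabric.

```python
from collections import Counter


def hapax_legomena(tokens):
    """Return the tokens that occur exactly once, in order of first appearance."""
    counts = Counter(tokens)
    return [token for token in tokens if counts[token] == 1]


# Toy corpus (hypothetical): the "context" here is just this one sentence.
corpus = "in the beginning god created the heavens and the earth".split()
print(hapax_legomena(corpus))
```

Changing the context changes the result: a word that is a hapax in one sentence may be common in a larger corpus, which is why the Notebook fixes the context to the whole Hebrew Bible.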
The following code will load the Text-Fabric version of the Biblia Hebraica Stuttgartensia (Amstelodamensis).
%load_ext autoreload
%autoreload 2
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment.
from tf.fabric import Fabric
from tf.app import use
# load the app and data
BHSA = use("etcbc/BHSA", hoist=globals())
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
book | 39 | 10938.21 | 100 |
chapter | 929 | 459.19 | 100 |
lex | 9230 | 46.22 | 100 |
verse | 23213 | 18.38 | 100 |
half_verse | 45179 | 9.44 | 100 |
sentence | 63717 | 6.70 | 100 |
sentence_atom | 64514 | 6.61 | 100 |
clause | 88131 | 4.84 | 100 |
clause_atom | 90704 | 4.70 | 100 |
phrase | 253203 | 1.68 | 100 |
phrase_atom | 267532 | 1.59 | 100 |
subphrase | 113850 | 1.42 | 38 |
word | 426590 | 1.00 | 100 |
The Text-Fabric code in this Notebook queries all words in the first and the last verse of this parasha (Deuteronomy 31:1-30 in the code below). From these results (two lists of tuples), the boundaries (first and last word node) are determined. The value of the feature freq_lex is then examined for every word node within this range. Whenever freq_lex equals one, the word and the verse it belongs to are reported as a hapax legomenon. The reported verse is hyperlinked to the STEP Bible, allowing for easy review of the verse.
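The range scan described above can be sketched without Text-Fabric by passing in any lookup that returns a frequency for a node number. The helper name hapaxes_in_range and the toy frequency table are hypothetical; in the Notebook, the lookup would be F.freq_lex.v and the bounds would be the first and last word nodes returned by the two queries.

```python
def hapaxes_in_range(first_node, last_node, freq_of):
    """Return the nodes in [first_node, last_node] whose frequency is exactly 1."""
    # range() excludes its stop value, so add 1 to include last_node.
    return [
        node
        for node in range(first_node, last_node + 1)
        if freq_of(node) == 1
    ]


# Hypothetical frequencies keyed by node number, standing in for F.freq_lex.v.
toy_freq = {101: 50, 102: 1, 103: 7, 104: 1}
print(hapaxes_in_range(101, 104, toy_freq.get))
```

Treating the bounds as inclusive matters here: the last word of the last verse is part of the parasha and should be checked as well.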
# find first word node for this parasha
startQuery = '''
verse book=Deuteronomium chapter=31 verse=1
word
'''
startResults = BHSA.search(startQuery)
# get the value of the first node in this list of tuples
startNode=startResults[0][1]
0.13s 13 results
# find last word node for this parasha
endQuery = '''
verse book=Deuteronomium chapter=31 verse=30
word
'''
endResults = BHSA.search(endQuery)
# get the value of the last node in this list of tuples
endNode=endResults[-1][1]
0.09s 16 results
# following is to escape some values for gloss that are labeled as '<uncertain>'
def escape_markdown(text):
    # a feature without a value returns None; treat that as an empty string
    return (text or "").replace("<", "&lt;").replace(">", "&gt;")
# now iterate over this range of nodes
numberOfHapax=0
# format the table using MarkDown
tableContent="Verse|Word|Gloss\n---|---|---\n"
# endNode + 1 because range() excludes its stop value
for node in range(startNode, endNode + 1):
    freq = F.freq_lex.v(node)
    if freq == 1:
        numberOfHapax += 1
        sectionTuple = T.sectionFromNode(node)
        linkSTEPbible = f"<a href=\"https://www.stepbible.org/?q=version=NASB2020\\|reference={sectionTuple[0]}.{sectionTuple[1]}:{sectionTuple[2]}&options=HNVUG\" target=\"_blank\">{sectionTuple[0]} {sectionTuple[1]}:{sectionTuple[2]}</a>"
        tableContent += f"{linkSTEPbible} | {F.g_word_utf8.v(node)}|{escape_markdown(F.gloss.v(node))}\n"
BHSA.dm(tableContent)
print(f"{numberOfHapax} hapaxes found")
Verse | Word | Gloss |
---|---|---|
0 hapaxes found
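The link and row formatting used in the loop can also be factored into small helpers, which makes them easy to check on their own. The sketch below assumes the same STEP Bible URL pattern as the Notebook cell above; the function names step_link and table_row and the sample values are hypothetical, and html.escape from the standard library stands in for the hand-written escaping.

```python
from html import escape


def step_link(book, chapter, verse, version="NASB2020"):
    """Build an HTML anchor to a verse on STEP Bible (URL pattern taken from the Notebook)."""
    # The pipe is backslash-escaped so it does not split a Markdown table cell.
    url = (
        f"https://www.stepbible.org/?q=version={version}\\|"
        f"reference={book}.{chapter}:{verse}&options=HNVUG"
    )
    return f'<a href="{url}" target="_blank">{book} {chapter}:{verse}</a>'


def table_row(book, chapter, verse, word, gloss):
    """Format one Markdown table row: verse link, word, escaped gloss."""
    # A missing gloss (None) becomes an empty cell; '<' and '>' become entities.
    safe_gloss = escape(gloss or "", quote=False)
    return f"{step_link(book, chapter, verse)} | {word} | {safe_gloss}\n"


# Sample values are made up for illustration.
print(table_row("Deuteronomy", 31, 1, "word", "<uncertain>"))
```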
The scripts in this notebook require (besides text-fabric) the following Python libraries to be installed in the environment:
{none}
You can install any missing library from within Jupyter Notebook using either pip or pip3.
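One way to check for a missing library before installing it is to ask Python whether the module can be found. The sketch below uses importlib.util.find_spec from the standard library; the helper name is_installed is hypothetical. It assumes that Text-Fabric is imported as tf (as in the import statements above) and installed under the package name text-fabric.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("tf"):
    # In a Jupyter cell, the %pip magic installs into the running kernel's environment.
    print("Text-Fabric not found; run: %pip install text-fabric")
```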
A discussion of hapax legomena, including details about ten hapaxes in the Hebrew Bible, can be found at TheTorah.com.