A hapax legomenon (plural: hapax legomena) is a term used in linguistics and literary analysis to refer to a word or idiomatic expression that appears only once in a specific corpus. In the context of the Bible, hapax legomena are words that occur only once in the entire biblical text, posing unique challenges for translators and scholars. Since these words lack additional occurrences for context, their meanings can often remain obscure or debated.
The same may be true for some dis legomena, words that appear only twice within a corpus. Although technically a dis legomenon rather than a hapax legomenon, the Greek word ἐπιούσιος, which occurs only in the Lord’s Prayer (Matthew 6:11 and Luke 11:3), exemplifies the same problem. Its precise meaning is uncertain, with interpretations ranging from "daily" to "necessary" to "supernatural."1
Not all New Testament hapax legomena, however, are obscure or lack comparison material. For instance:
Names of people, places, or objects are often clear; for example, the name Δανιήλ (Daniel) has an unambiguous meaning.
Hapax legomena with Septuagint parallels: Certain New Testament hapax legomena can be cross-referenced with the Septuagint (LXX), which can provide additional insight. Take, for example, the use of the Greek verb ἱερουργέω ('to minister as a priest') in Romans 15:16. The Zondervan Illustrated Bible Backgrounds Commentary provides the following comment:
“The priestly duty of proclaiming the gospel of God” renders a difficult Greek phrase, with “gospel” as the object of the verb “offer sacrifice.” But a parallel is found in 4 Maccabees 7:8, which speaks of “administrators of the law” (lit., “those who serve the law as priests”).2
While a list of hapax legomena can be computed with relative ease, its usefulness is limited if presented without contextual or semantic information. In this notebook, we will therefore categorize the New Testament hapax legomena by their semantic domains, as defined by Louw-Nida,4 to enhance their interpretative value. This semantic data is available in the feature ln.
%load_ext autoreload
%autoreload 2
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use
# Load the app and data
N1904 = use("CenterBLC/N1904", version="1.0.0", hoist=globals())
Locating corpus resources ...
| Name | # of nodes | # slots / node | % coverage |
| --- | --- | --- | --- |
| book | 27 | 5102.93 | 100 |
| chapter | 260 | 529.92 | 100 |
| verse | 7944 | 17.34 | 100 |
| sentence | 8011 | 17.20 | 100 |
| group | 8945 | 7.01 | 46 |
| clause | 42506 | 8.36 | 258 |
| wg | 106868 | 6.88 | 533 |
| phrase | 69007 | 1.90 | 95 |
| subphrase | 116178 | 1.60 | 135 |
| word | 137779 | 1.00 | 100 |
Display is setup for viewtype syntax-view
See here for more information on viewtypes
# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())
The underlying principle of the script below is rather straightforward. The primary challenge lies in deciding which feature to use for identifying hapax legomena. The two most obvious options are:
normalized: This is basically a 'cleaned up' version of the surface text; the normalization accounts for variations in accentuation. Note, however, that with this feature inflected forms of verbs and declined forms of nouns still count as separate words.
lemma: Here the base or dictionary form of each word, its lemma, serves as the basis for the frequency calculation. A few lemma values additionally distinguish a specific sense associated with the lemma: for example, lemma="βάτος (II)" is found only once (in Luke 16:6), while lemma="βάτος (I)" is found five times. The sketch below illustrates how the two features relate.
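As a quick illustration of the relation between the two features, the following minimal sketch counts how many distinct normalized forms are attached to each lemma. It assumes the corpus has been loaded as above and that both features are present on word nodes; the lemma λέγω is merely an illustrative key.
from collections import defaultdict
# Collect the set of distinct normalized surface forms per lemma
formsPerLemma = defaultdict(set)
for word in F.otype.s("word"):
    formsPerLemma[F.lemma.v(word)].add(F.normalized.v(word))
# A frequent verb typically has many inflected forms, so frequencies based on
# the normalized feature are spread over many entries for one lemma
print(len(formsPerLemma.get("λέγω", set())))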
In the N1904-TF dataset, we face an additional challenge due to its data design: the lemma feature is associated not only with word nodes but also with phrase and subphrase nodes. Therefore, the following approach does not work with this dataset:
featureFrequencyList = Fs("lemma").freqList()
foundLemmata = 0
for lemma, freq in featureFrequencyList:
    if freq == 1:
        print(lemma)
        foundLemmata += 1
print(f'Found {foundLemmata} lemmata')
Found 0 lemmata
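As an aside, recent Text-Fabric versions accept a nodeTypes argument to freqList(), restricting the count to the given node types. If your installed version supports this parameter, the following variant should sidestep the problem; treat it as a sketch rather than a tested recipe.
# Count lemma values on word nodes only (assumes freqList supports nodeTypes)
wordLemmaFreqList = Fs("lemma").freqList(nodeTypes={"word"})
hapaxCount = sum(1 for lemma, freq in wordLemmaFreqList if freq == 1)
print(f'Found {hapaxCount} lemmata occurring only once')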
To build an effective script, we will iterate over all word nodes, retrieve each lemma, and create a dictionary to store this information. For each new lemma encountered, the dictionary will record the lemma itself, initialize a counter at 1, and store the node’s numeric value for its first occurrence (or, in the case of a hapax legomenon, its only instance). For subsequent occurrences of an already recorded lemma, the counter will increment accordingly.
# Initialize an empty dictionary to store lemma counts and first occurrences
lemmaInfoDict = {}

# Iterate over each word in the Greek New Testament
for word in F.otype.s("word"):  # assuming F.otype.s("word") gives each word in the text
    lemma = Fs("lemma").v(word)  # get the lemma for the current word
    # Check if the lemma is already in the dictionary
    if lemma in lemmaInfoDict:
        # Increment the count of the lemma
        lemmaInfoDict[lemma] = (lemmaInfoDict[lemma][0] + 1, lemmaInfoDict[lemma][1])
    else:
        # If it's a new lemma, set the count to 1 and record the current word node as the first occurrence
        lemmaInfoDict[lemma] = (1, word)
Note that this cell does not produce any output: it only creates the dictionary lemmaInfoDict, which will be used in the next section.
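As a quick sanity check, you can look up a lemma whose frequency is known, such as the βάτος example from above (assuming the key spelling, including the sense marker, matches the lemma feature exactly):
# This lemma should occur exactly once, in Luke 16:6
count, firstNode = lemmaInfoDict["βάτος (II)"]
print(count, T.sectionFromNode(firstNode))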
The newly created dictionary lemmaInfoDict contains the frequency of every lemma. To print only those with a count of 1, you can use the following script:
from IPython.display import display
import pandas as pd

# Filter lemmas with a count of 1 (hapax legomena) and retrieve section information
hapaxLegomena = {
    lemma: T.sectionFromNode(firstOccurrence)
    for lemma, (count, firstOccurrence) in lemmaInfoDict.items()
    if count == 1
}

# Convert the dictionary to a DataFrame, splitting the tuple into separate columns
hapaxLegomenaDF = pd.DataFrame.from_dict(
    hapaxLegomena,
    orient='index',
    columns=['Book', 'Chapter', 'Verse']
)

# Display the DataFrame as a table
display(hapaxLegomenaDF)
| | Book | Chapter | Verse |
| --- | --- | --- | --- |
| Ζάρα | Matthew | 1 | 3 |
| Θαμάρ | Matthew | 1 | 3 |
| Ῥαχάβ | Matthew | 1 | 5 |
| Ῥούθ | Matthew | 1 | 5 |
| Οὐρίας | Matthew | 1 | 6 |
| ... | ... | ... | ... |
| δωδέκατος | Revelation | 21 | 20 |
| ἀμέθυστος | Revelation | 21 | 20 |
| διαυγής | Revelation | 21 | 21 |
| κατάθεμα | Revelation | 22 | 3 |
| ῥυπαίνω | Revelation | 22 | 11 |

1927 rows × 3 columns
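Since the hapax list is now an ordinary pandas DataFrame, it can also be exported for use outside the notebook; the file name below is merely illustrative.
# Write the hapax table to a CSV file in the current working directory
hapaxLegomenaDF.to_csv('hapax_legomena.csv', encoding='utf-8')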
To map each lemma to its Louw-Nida top-level classification, we first need to create a mapping dictionary called louwNidaMapping5 (section 3.3.1). Next, we can use the first part of the value of the ln feature as an index into the louwNidaMapping table. This "translates" the Louw-Nida top-level domain from a numeric value into a descriptive label, which allows us to group all hapax legomena by their top-level Louw-Nida domain (the primary domain number before any subdomains). Later in the script, we add links to BibleOL to facilitate easy verification of each occurrence’s context.
# The following script will produce a dictionary of Louw-Nida top-level domains
# The structure of the dictionary is:
#     louwNidaMapping = {
#         numeric (key) : "description"
#         ...
#     }
import requests
from bs4 import BeautifulSoup

# Debugging mode (False, True)
debug = False

# URL of the Louw-Nida classifications page
url = "https://www.laparola.net/greco/louwnida.php"

# Retrieve the webpage content
response = requests.get(url)
if debug:
    print(f"Retrieving URL {url} returned {response}")
response.raise_for_status()  # Check for request errors

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Initialize an empty dictionary
louwNidaMapping = {}

# Find all <h3> elements that contain the Louw-Nida classification data
for entry in soup.find_all("h3"):
    # Extract the number from the <a> tag within the <h3> tag
    numberTag = entry.find("a")
    descriptionText = entry.get_text(strip=True)
    # Ensure there's content to process
    if numberTag and descriptionText:
        # Attempt to parse the number and description
        keyText = numberTag.get_text(strip=True)
        try:
            # Convert the number to an integer
            key = int(keyText)
        except ValueError:
            # If conversion fails, skip this entry
            if debug:
                print(f"Skipping entry due to non-numeric key: {keyText}")
            continue
        # Get description by removing the number portion from the full text
        description = descriptionText.replace(keyText, "", 1).strip(' "')
        # Add to dictionary
        louwNidaMapping[key] = description
        if debug:
            print(f"Added classification: {key}: {description}")
if debug:
    print(f"Resulting dictionary: {louwNidaMapping}")
The following script groups the hapax legomena according to their Louw-Nida top-level classification. Code has been added to allow for 'drilling down' by collapsing and expanding parts of the output.
from IPython.display import display, HTML  # IPython is included as part of the standard environment in Jupyter notebooks
from collections import defaultdict

# Initialize dictionary for lemma occurrences and their first occurrence nodes
lemmaInfoDict = {}

# Iterate over each word in the Greek New Testament
for word in F.otype.s("word"):  # fetch each word node in the text
    lemma = F.lemma.v(word)  # get the lemma associated with the current word
    # Check if the lemma is already in lemmaInfoDict
    if lemma in lemmaInfoDict:
        # Increment the count of the lemma
        lemmaInfoDict[lemma] = (lemmaInfoDict[lemma][0] + 1, lemmaInfoDict[lemma][1])
    else:
        # If lemma is new, set count to 1 and record the current word as the first occurrence
        lemmaInfoDict[lemma] = (1, word)

# Initialize dictionary to store hapax legomena grouped by Louw-Nida classifications
hapaxLouwNidaTopLevelDict = defaultdict(list)

# Filter lemmas that are hapax legomena and group them by Louw-Nida domain
for lemma, (count, firstOccurrenceNode) in lemmaInfoDict.items():
    if count == 1:  # Only consider hapax legomena (occurrences = 1)
        # Get Louw-Nida classification, replace None with "0"
        louwNida = str(F.ln.v(firstOccurrenceNode) or "0")
        book, chapter, verse = T.sectionFromNode(firstOccurrenceNode)  # Get location details
        topLevelLouwNidaClass = louwNida.split(".")[0]  # Extract top-level domain or "0" if missing
        # Add lemma with location details to the corresponding Louw-Nida classification group
        hapaxLouwNidaTopLevelDict[topLevelLouwNidaClass].append((lemma, book, chapter, verse, firstOccurrenceNode))

# Calculate totals for each Louw-Nida top-level classification group
louwNidaTotals = {group: len(lemmas) for group, lemmas in hapaxLouwNidaTopLevelDict.items()}

# Sort totals by total count from high to low
sortedLouwNidaTotals = sorted(louwNidaTotals.items(), key=lambda x: x[1], reverse=True)

# Generate HTML output with collapsible sections for each Louw-Nida classification
hapaxHtmlOutput = "<html><body><h2>Hapax legomena by Louw-Nida top-level classification</h2>\n"
for group, total in sortedLouwNidaTotals:
    # Display "Unknown" for group "0"
    classificationLabel = "Unknown" if group == "0" else louwNidaMapping.get(int(group), 'Unknown Classification')
    hapaxHtmlOutput += f"<details><summary>Louw-Nida Top-Level Domain {group}: {classificationLabel} (Total: {total})</summary><ul>\n"
    for lemma, book, chapter, verse, node in hapaxLouwNidaTopLevelDict[group]:
        hapaxHtmlOutput += f"<li><a href='https://learner.bible/text/show_text/nestle1904/{book}/{chapter}/{verse}' target='_blank'>{book} {chapter}:{verse}</a>:\t{lemma} ({F.gloss.v(node)})</li>\n"
    hapaxHtmlOutput += "</ul></details>\n"
hapaxHtmlOutput += "</body></html>"

# Display the report in the Jupyter Notebook
display(HTML(hapaxHtmlOutput))
The following script creates a download link for the report generated in the previous cell.
from IPython.display import HTML
import base64  # used to encode the data to be downloaded

def createDownloadLink(htmlContent, fileName):
    # Encode the HTML content to base64
    b64Html = base64.b64encode(htmlContent.encode()).decode()
    # Create the HTML download link
    downloadLink = f'''
    <a download="{fileName}" href="data:text/html;base64,{b64Html}">
        <button>Download HTML File</button>
    </a>
    '''
    return HTML(downloadLink)

# Display the download link in the notebook
createDownloadLink(hapaxHtmlOutput, 'hapax_to_louw_nida.html')
1 See the extensive discussion on ἐπιούσιος in: Brant Pitre, Jesus and the Jewish Roots of the Eucharist: Unlocking the Secrets of the Last Supper (New York: Doubleday, 2011), 93-96 and passim.
2 See Clinton E. Arnold (general editor), Zondervan Illustrated Bible Backgrounds Commentary, Volume Three: Romans to Philemon (Grand Rapids, MI: Zondervan Academic, 2002). Note that 4 Maccabees is included in important LXX manuscripts such as Codex Sinaiticus and Codex Alexandrinus.
3 See for example Kittel, ThWNT IV 285f. In EKK, Wolfgang Schrage cautions that, although numerous papyri link this word to a sacred monetary collection, this does not confirm that Paul's usage necessarily includes a cultic-sacral component. Wolfgang Schrage, Der erste Brief an die Korinther, VII/4, Evangelisch-Katholischer Kommentar zum Neuen Testament (Düsseldorf: Benziger, 2001), 425.
4 Johannes P. Louw and Eugene Albert Nida, Greek-English Lexicon of the New Testament: Based on Semantic Domains (New York: United Bible Societies, 1996).
5 The dictionary is based upon Louw-Nida Lexicon @ laparola.net.
The scripts in this notebook require (besides text-fabric) the following Python libraries to be installed in the environment:
beautifulsoup4
pandas
requests
The base64 and collections modules used above are part of the Python standard library and need no separate installation. You can install any missing library from within Jupyter Notebook using either pip or pip3.
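For example, the third-party packages can be installed from a notebook cell (package names as distributed on PyPI):
!pip install text-fabric requests beautifulsoup4 pandas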