#!/usr/bin/env python # coding: utf-8 # # Hapax legomena (N1904-TF) # ## Table of content (TOC) # * 1 - Introduction # * 2 - Load Text-Fabric app and data # * 3 - Performing the queries # * 3.1 - Determine the frequency of all words # * 3.2 - Print only the hapax legomana lemmata # * 3.3 - Matching the hapax legomana with their Louw-Nida clasification # * 3.3.1 - Defining the mapping dictionairy # * 3.3.2 - Group hapax legomena by Louw-Nida top-level classification # * 3.3.3 - Download as interactive HTML file # * 4 - Attribution and footnotes # * 5 - Required libraries # * 6 - Notebook and environment details # # 1 - Introduction # ##### [Back to TOC](#TOC) # # A *hapax legomenon* (plural: *hapax legomena*) is a term used in linguistics and literary analysis to refer to a word or idiomatic expression that appears only once in a specific corpus. In the context of the Bible, *hapax legomena* are words that occur only once in the entire biblical text, posing unique challenges for translators and scholars. Since these words lack additional occurrences for context, their meanings can often remain obscure or debated. # # The same may be true for some *dis legomena*, words that appear only twice within a corpus. While technically not a *hapax legomenon*, the Greek word ἐπιούσιος, which occurs only twice in the Lord’s Prayer ([Matthew 6:11](https://learner.bible/text/show_text/nestle1904/Matthew/6/11) and [Luke 11:3](https://learner.bible/text/show_text/nestle1904/Luke/11/3)), exemplifies a similar issue. Its precise meaning is uncertain, with interpretations ranging from "daily" to "necessary" to "supernatural." ¹ # # Not all New Testament *hapax legomena*, however, are obscure or lack comparison material. For instance: # # * Names of people, places, or objects are often clear; for example, the name Δανιήλ (Daniel) has an unambiguous meaning. # # * *Hapax legomena* with Septuagint parallels: Certain New Testament *hapax legomena* can be cross-referenced with the Septuagint (LXX), which can provide additional insights.Take for example the use of the Greek verb ἱερουργέω ('to minister as a priest') in [Romans 15:16](https://learner.bible/text/show_text/nestle1904/Romans/15/16). *Zondervan Illustrated Bible Backgrounds Commentary* provides the following comment: # > “The priestly duty of proclaiming the gospel of God” renders a difficult Greek phrase, with “gospel” as the object of the verb “offer sacrifice.” But a parallel is found in *4 Maccabees 7:8*, which speaks of “administrators of the law” (lit., “those who serve the law as priests”).² # # * In some cases, other contemporary extra-biblical corpora may also contain these unique words, providing further insights into their meanings within the New Testament. For example, the term λογείας in [1 Corinthians 16:1](https://learner.bible/text/show_text/nestle1904/I_Corinthians/16/1), a biblical *hapax legomenon*, is known from Greek ostraca found in Egypt and Nubia, suggesting that the collection in Corinth may also have had a cultic-sacral context. ³ # # While a list of *hapax legomena* can be computed with relative ease, its usefulness may be limited if presented without contextual or semantic information. In this notebook, we will categorize the New Testament *hapax legomena* by their semantic domains, as defined by Louw-Nida,⁴ to enhance interpretative value. This semantic data is available in feature [ln](https://centerblc.github.io/N1904/features/ln.html#start). # # 2 - Load Text-Fabric app and data # ##### [Back to TOC](#TOC) # In[1]: get_ipython().run_line_magic('load_ext', 'autoreload') get_ipython().run_line_magic('autoreload', '2') # In[2]: # Loading the Text-Fabric code # Note: it is assumed Text-Fabric is installed in your environment from tf.fabric import Fabric from tf.app import use # In[3]: # Load the app and data N1904 = use ("CenterBLC/N1904", version="1.0.0", hoist=globals()) # In[4]: # The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer) N1904.dh(N1904.getCss()) # # 3 - Performing the queries # ##### [Back to TOC](#TOC) # ## 3.1 - Determine the frequency of all words # The underlying principle of the script below is rather straightforward. However, the primary challenge lies in determining the feature to be employed in the identification of hapax legomena. The two most obvious options are: # # * [normalized](https://centerblc.github.io/N1904/features/normalized.html#start): This is basicly a 'cleaned up' version of the surface text. It does take into account forms where inflections of verbs and declensions of nouns are considered as separate words. The normalization is required to account for variations in accentuation. # # * [lemma](https://centerblc.github.io/N1904/features/lemma.html#start#start): here the base or root form of words, known as lemmas, serves as the basis for frequency calculations. When based upon feature "lemma", there are a few instances reported which refer to a specific sense associated with that lemma. For example, lemma="βάτος (II)" is only found once (in [Luke 16:6](https://learner.bible/text/show_text/nestle1904/Luke/16/6)), while lemma="βάτος (I)" is found five times. # # In the N1904-TF dataset, we face an additional challenge due to its data design: the lemma feature is associated not only with word nodes but also with phrase and subphrase nodes. Therefore, the following approach does not work with this dataset: # In[5]: FeatureFrequenceLists=Fs("lemma").freqList() foundLemmata=0 for lemma, freq in FeatureFrequenceLists: if freq==1: print (lemma) foundLemmata+=1 print (f'Found {foundLemmata} lemmata') # To build an effective script, we will iterate over all word nodes, retrieve each lemma, and create a dictionary to store this information. For each new lemma encountered, the dictionary will record the lemma itself, initialize a counter at 1, and store the node’s numeric value for its first occurrence (or, in the case of a hapax legomenon, its only instance). For subsequent occurrences of an already recorded lemma, the counter will increment accordingly. # In[6]: # Initialize an empty dictionary to store lemma counts and first occurrences lemmaInfoDict = {} # Iterate over each word in the Greek New Testament for word in F.otype.s("word"): # assuming F.otype.s("word") gives each word in the text lemma = Fs("lemma").v(word) # get the lemma for the current word # Check if the lemma is already in the dictionary if lemma in lemmaInfoDict: # Increment the count of the lemma lemmaInfoDict[lemma] = (lemmaInfoDict[lemma][0] + 1, lemmaInfoDict[lemma][1]) else: # If it's a new lemma, set the count to 1 and record the current word node as the first occurrence lemmaInfoDict[lemma] = (1, word) # Note that this cell will not provide any output since it just creates the dictitonary `lemmaInfoDict` which will be used in the next section. # ## 3.2 - Print only the hapax legomana lemmata # The just created dictionary `lemmaInfoDict` contains the frequency of all lemmata. To print only those with a count of 1, you can use the following script: # In[7]: from IPython.display import display import pandas as pd # Filter lemmas with a count of 1 (hapax legomena) and retrieve section information hapaxLegomena = { lemma: T.sectionFromNode(firstOccurrence) for lemma, (count, firstOccurrence) in lemmaInfoDict.items() if count == 1 } # Convert the dictionary to a DataFrame, splitting the tuple into separate columns hapaxLegomenaDF = pd.DataFrame.from_dict( hapaxLegomena, orient='index', columns=['Book', 'Chapter', 'Verse'] ) # Display the DataFrame as a table display(hapaxLegomenaDF) # ## 3.3 - Matching the hapax legomana with their Louw-Nida clasification # # To map each lemma to its Louw-Nida top-level classification, we first need to create a mapping dictionary called `louwNidaMapping` ([section 3.3.1](#bullet3x3x1)). Next, we can use the first part of the value from the [ln](https://centerblc.github.io/N1904/features/ln.html#start) feature as an index in the `louwNidaMapping` table. This will "translate" the Louw-Nida top-level domain from a numeric value to a more descriptive one. This approach allows us to group all *hapax legomena* by their top-level Louw-Nida domain (the primary domain number before any subdomains). Later in the script, we add links to [BibleOL](https://learner.bible) to facilitate easy verification of each occurrence’s context. # ### 3.3.1 - Defining the mapping dictionairy # # The following cell builds a dictionairy of high-level Louw-Nida top-level classification. ⁵ # In[8]: # The following script will produce a dictionairy of Louw-Nida Top-level domains # The structure of the dictionairy is: # louwNidaMapping = { # numeric (key) : "description" # ... # } import requests from bs4 import BeautifulSoup # Debugging mode (False, True) debug = False # URL of the Louw-Nida classifications page url = "https://www.laparola.net/greco/louwnida.php" # Retrieve the webpage content response = requests.get(url) if debug: print(f"Retrieving URL {url} returned {response}") response.raise_for_status() # Check for request errors # Parse the HTML content with BeautifulSoup soup = BeautifulSoup(response.text, "html.parser") # Initialize an empty dictionary louwNidaMapping = {} # Find all

elements that contain the Louw-Nida classification data for entry in soup.find_all("h3"): # Extract the number from the tag within the

tag numberTag = entry.find("a") descriptionText = entry.get_text(strip=True) # Ensure there's content to process if numberTag and descriptionText: # Attempt to parse the number and description keyText = numberTag.get_text(strip=True) try: # Convert the number to an integer key = int(keyText) except ValueError: # If conversion fails, skip this entry if debug: print(f"Skipping entry due to non-numeric key: {keyText}") continue # Get description by removing the number portion from the full text description = descriptionText.replace(keyText, "", 1).strip(' "') # Add to dictionary louwNidaMapping[key] = description if debug: print(f"Added classification: {key}: {description}") if debug: print(f"Resulting dictionary: {louwNidaMapping}") # ### 3.3.2 - Group hapax legomena by Louw-Nida top-level classification # # The following script will group the hapax legomena according to its Louw-Nida top-level clasification. Code has been added to allow for 'drilling down' by collapsing and expanding parts of the output. # In[9]: from IPython.display import display, HTML # IPython is included as part of the standard environment in Jupyter notebooks from collections import defaultdict # Initialize dictionary for lemma occurrences and their first occurrence nodes lemmaInfoDict = {} # Iterate over each word in the Greek New Testament for word in F.otype.s("word"): # fetch each word node in the text lemma = F.lemma.v(word) # get the lemma associated with the current word # Check if the lemma is already in lemmaInfoDict if lemma in lemmaInfoDict: # Increment the count of the lemma lemmaInfoDict[lemma] = (lemmaInfoDict[lemma][0] + 1, lemmaInfoDict[lemma][1]) else: # If lemma is new, set count to 1 and record the current word as the first occurrence lemmaInfoDict[lemma] = (1, word) # Initialize dictionary to store hapax legomena grouped by Louw-Nida classifications hapaxLouwNidaTopLevelDict = defaultdict(list) # Filter lemmas that are hapax legomena and group them by Louw-Nida domain for lemma, (count, firstOccurrenceNode) in lemmaInfoDict.items(): if count == 1: # Only consider hapax legomena (occurrences = 1) # Get Louw-Nida classification, replace None with "0" louwNida = str(F.ln.v(firstOccurrenceNode) or "0") book, chapter, verse = T.sectionFromNode(firstOccurrenceNode) # Get location details topLevelLouwNidaClass = louwNida.split(".")[0] # Extract top-level domain or "0" if missing # Add lemma with location details to the corresponding Louw-Nida classification group hapaxLouwNidaTopLevelDict[topLevelLouwNidaClass].append((lemma, book, chapter, verse, firstOccurrenceNode)) # Calculate totals for each Louw-Nida top-level classification group louwNidaTotals = {group: len(lemmas) for group, lemmas in hapaxLouwNidaTopLevelDict.items()} # Sort totals by total count from high to low sortedLouwNidaTotals = sorted(louwNidaTotals.items(), key=lambda x: x[1], reverse=True) # Generate HTML output with collapsible sections for each Louw-Nida classification hapaxHtmlOutput = "

Hapax legomena by Louw-Nida top-level classification

\n" for group, total in sortedLouwNidaTotals: # Display "Unknown" for group "0" classification_label = "Unknown" if group == "0" else louwNidaMapping.get(int(group), 'Unknown Classification') hapaxHtmlOutput += f"

Louw-Nida Top-Level Domain {group}: {classification_label} (Total: {total})

{book} {chapter}:{verse}:\t{lemma} ({F.gloss.v(node)})

\n" hapaxHtmlOutput += "" # Display the report in the Jupyter Notebook display(HTML(hapaxHtmlOutput)) # ## 3.3.3 - Download as interactive HTML file # # The following script creates a download link for the report generated in the previous cell. # In[10]: from IPython.display import HTML import base64 # used to encode the data to be downloaded def createDownloadLink(htmlContent, fileName): # Encode the HTML content to base64 b64Html = base64.b64encode(htmlContent.encode()).decode() # Create the HTML download link downloadLink = f''' ''' return HTML(downloadLink) # Display the download link in the notebook createDownloadLink(hapaxHtmlOutput, 'hapax_to_louw_nida.html') # # 4 - Attribution and footnotes # ##### [Back to TOC](#TOC) # # #### Footnotes: # ¹ See the extensive discussion on ἐπιούσιος in: Brant Pitre, *Jesus and the Jewish Roots of the Eucharist, Unlocking the Secrets of the Last Supper* (New York: Doubleday, 2011), 93-96; *pasim*. # # ² See Arnold E. Clinton (general editor), *Zondervan Illustrated Bible Backgrounds Commentary, Volume Three, Romans to Philemon* (Grand Rapids, MI: Zondervan Academic, 2002). Note that 4 Maccabees is included in LXX-Vaticanus. # # ³ See for example Kittel, ThWNT IV 285f. In EKK, Wolfgang Schrage cautions that, although numerous papyri link this word to a sacred monetary collection, this does not confirm that Paul's usage necessarily includes a cultic-sacral component. Wolfgang Schrage, *Der erste Letter an die Korinther VII/4, Evangelisch Katholischer Komentar zum neuen Testament* (Düsseldorf: Benziger, 2001), 425. # # ⁴ Johannes P. Louw and Eugene Albert Nida, *Greek-English Lexicon of the New Testament: Based on Semantic Domains* (New York: United Bible Societies, 1996). # # ⁵ The dictionary is based upon [Louw-Nida Lexicon @ laparola.net](https://www.laparola.net/greco/louwnida.php). # # 5 - Required libraries # ##### [Back to TOC](#TOC) # # The scripts in this notebook require (beside `text-fabric`) the following Python libraries to be installed in the environment: # # base64 # collections # pandas # subprocess # # You can install any missing library from within Jupyter Notebook using either`pip` or `pip3`. # # 6 - Notebook and environment details # ##### [Back to TOC](#TOC) # #

# # # # # # # # # # # # # #

Author	Tony Jurg
Version	1.1
Date	17 November 2024

# The following cell displays the active Anaconda environment along with a list of all installed packages and their versions within that environment. # In[11]: import subprocess from IPython.display import display, HTML # Display the active conda environment get_ipython().system('conda env list | findstr "*"') # Run conda list and capture the output condaListOutput = subprocess.check_output("conda list", shell=True).decode("utf-8") # Wrap the output with

and

HTML tags htmlOutput = "

Click to view installed packages

"
htmlOutput += condaListOutput
htmlOutput += "

" # Display the HTML in the notebook display(HTML(htmlOutput))