#!/usr/bin/env python
# coding: utf-8

# # Hapax legomena (N1904-TF)

# ## Table of contents (TOC)

# * 1 - Introduction
# * 2 - Load Text-Fabric app and data
# * 3 - Performing the queries
#     * 3.1 - Determine the frequency of all words
#     * 3.2 - Print only the hapax legomena lemmata
#     * 3.3 - Matching the hapax legomena with their Louw-Nida classification
#         * 3.3.1 - Defining the mapping dictionary
#         * 3.3.2 - Group hapax legomena by Louw-Nida top-level classification
#         * 3.3.3 - Download as interactive HTML file
# * 4 - Attribution and footnotes
# * 5 - Required libraries
# * 6 - Notebook and environment details

# # 1 - Introduction
# ##### [Back to TOC](#TOC)
#
# A *hapax legomenon* (plural: *hapax legomena*) is a term used in linguistics and literary analysis to refer to a word or idiomatic expression that appears only once in a specific corpus. In the context of the Bible, *hapax legomena* are words that occur only once in the entire biblical text, posing unique challenges for translators and scholars. Since these words lack additional occurrences for context, their meanings can often remain obscure or debated.
#
# The same may be true for some *dis legomena*, words that appear only twice within a corpus. While technically not a *hapax legomenon*, the Greek word ἐπιούσιος, which occurs only twice in the Lord’s Prayer ([Matthew 6:11](https://learner.bible/text/show_text/nestle1904/Matthew/6/11) and [Luke 11:3](https://learner.bible/text/show_text/nestle1904/Luke/11/3)), exemplifies a similar issue. Its precise meaning is uncertain, with interpretations ranging from "daily" to "necessary" to "supernatural."<sup>1</sup>
#
# Not all New Testament *hapax legomena*, however, are obscure or lack comparison material. For instance:
#
# * Names of people, places, or objects are often clear; for example, the name Δανιήλ (Daniel) has an unambiguous meaning.
#
# * *Hapax legomena* with Septuagint parallels: certain New Testament *hapax legomena* can be cross-referenced with the Septuagint (LXX), which can provide additional insights. Take for example the use of the Greek verb ἱερουργέω ('to minister as a priest') in [Romans 15:16](https://learner.bible/text/show_text/nestle1904/Romans/15/16). The *Zondervan Illustrated Bible Backgrounds Commentary* provides the following comment:
#
# > “The priestly duty of proclaiming the gospel of God” renders a difficult Greek phrase, with “gospel” as the object of the verb “offer sacrifice.” But a parallel is found in *4 Maccabees 7:8*, which speaks of “administrators of the law” (lit., “those who serve the law as priests”).<sup>2</sup>
#
# * In some cases, other contemporary extra-biblical corpora may also contain these unique words, providing further insight into their meanings within the New Testament. For example, the term λογείας in [1 Corinthians 16:1](https://learner.bible/text/show_text/nestle1904/I_Corinthians/16/1), a biblical *hapax legomenon*, is known from Greek ostraca found in Egypt and Nubia, suggesting that the collection in Corinth may also have had a cultic-sacral context.<sup>3</sup>
#
# While a list of *hapax legomena* can be computed with relative ease, its usefulness may be limited if presented without contextual or semantic information. In this notebook, we will categorize the New Testament *hapax legomena* by their semantic domains, as defined by Louw-Nida,<sup>4</sup> to enhance their interpretative value. This semantic data is available in feature [ln](https://centerblc.github.io/N1904/features/ln.html#start).
# # 2 - Load Text-Fabric app and data
# ##### [Back to TOC](#TOC)

# In[1]:

get_ipython().run_line_magic('load_ext', 'autoreload')
get_ipython().run_line_magic('autoreload', '2')

# In[2]:

# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use

# In[3]:

# Load the app and data
N1904 = use("CenterBLC/N1904", version="1.0.0", hoist=globals())

# In[4]:

# The following will push the Text-Fabric stylesheet to this notebook (to facilitate proper display with notebook viewer)
N1904.dh(N1904.getCss())

# # 3 - Performing the queries
# ##### [Back to TOC](#TOC)

# ## 3.1 - Determine the frequency of all words

# The underlying principle of the script below is rather straightforward. The primary challenge, however, lies in choosing the feature to be used for identifying hapax legomena. The two most obvious options are:
#
# * [normalized](https://centerblc.github.io/N1904/features/normalized.html#start): this is basically a 'cleaned up' version of the surface text. Note that inflected forms of verbs and declined forms of nouns count as separate words. The normalization is required to account for variations in accentuation.
#
# * [lemma](https://centerblc.github.io/N1904/features/lemma.html#start): here the base or dictionary form of each word, known as its lemma, serves as the basis for frequency calculations. When using feature "lemma", a few reported instances refer to a specific sense associated with that lemma. For example, lemma="βάτος (II)" is found only once (in [Luke 16:6](https://learner.bible/text/show_text/nestle1904/Luke/16/6)), while lemma="βάτος (I)" is found five times.
#
# In the N1904-TF dataset, we face an additional challenge due to its data design: the lemma feature is associated not only with word nodes but also with phrase and subphrase nodes.
# Therefore, the following approach does not work with this dataset:

# In[5]:

featureFrequencyList = Fs("lemma").freqList()
foundLemmata = 0
for lemma, freq in featureFrequencyList:
    if freq == 1:
        print(lemma)
        foundLemmata += 1
print(f'Found {foundLemmata} lemmata')

# To build an effective script, we will iterate over all word nodes, retrieve each lemma, and create a dictionary to store this information. For each new lemma encountered, the dictionary will record the lemma itself, initialize a counter at 1, and store the node’s numeric value for its first occurrence (or, in the case of a hapax legomenon, its only instance). For subsequent occurrences of an already recorded lemma, the counter will be incremented accordingly.

# In[6]:

# Initialize an empty dictionary to store lemma counts and first occurrences
lemmaInfoDict = {}

# Iterate over each word node in the Greek New Testament
for word in F.otype.s("word"):
    lemma = Fs("lemma").v(word)  # get the lemma for the current word
    # Check if the lemma is already in the dictionary
    if lemma in lemmaInfoDict:
        # Increment the count of the lemma, keeping the stored first occurrence
        lemmaInfoDict[lemma] = (lemmaInfoDict[lemma][0] + 1, lemmaInfoDict[lemma][1])
    else:
        # If it is a new lemma, set the count to 1 and record the current word node as the first occurrence
        lemmaInfoDict[lemma] = (1, word)

# Note that this cell does not produce any output; it only creates the dictionary `lemmaInfoDict`, which will be used in the next section.

# ## 3.2 - Print only the hapax legomena lemmata

# The just-created dictionary `lemmaInfoDict` contains the frequency of all lemmata.
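# The counting-and-first-occurrence pattern used above can be illustrated with a self-contained toy example that does not require the Text-Fabric dataset. The `(node, lemma)` pairs below are invented for illustration; in the real notebook they come from `F.otype.s("word")` and `Fs("lemma").v(word)`.

```python
# Toy stand-in for iterating word nodes and reading their lemma:
# (nodeNumber, lemma) pairs, invented for illustration
toyWords = [(1, "λόγος"), (2, "θεός"), (3, "λόγος"), (4, "ἀγάπη")]

lemmaInfo = {}  # lemma -> (count, node of first occurrence)
for node, lemma in toyWords:
    if lemma in lemmaInfo:
        count, firstNode = lemmaInfo[lemma]
        lemmaInfo[lemma] = (count + 1, firstNode)
    else:
        lemmaInfo[lemma] = (1, node)

# Lemmata with count 1 are the hapax legomena of this toy corpus
hapaxes = {lemma: node for lemma, (count, node) in lemmaInfo.items() if count == 1}
print(hapaxes)  # {'θεός': 2, 'ἀγάπη': 4}
```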
# To print only those with a count of 1, you can use the following script:

# In[7]:

from IPython.display import display
import pandas as pd

# Filter lemmata with a count of 1 (hapax legomena) and retrieve section information
hapaxLegomena = {
    lemma: T.sectionFromNode(firstOccurrence)
    for lemma, (count, firstOccurrence) in lemmaInfoDict.items()
    if count == 1
}

# Convert the dictionary to a DataFrame, splitting the tuple into separate columns
hapaxLegomenaDF = pd.DataFrame.from_dict(
    hapaxLegomena, orient='index', columns=['Book', 'Chapter', 'Verse']
)

# Display the DataFrame as a table
display(hapaxLegomenaDF)

# ## 3.3 - Matching the hapax legomena with their Louw-Nida classification
#
# To map each lemma to its Louw-Nida top-level classification, we first need to create a mapping dictionary called `louwNidaMapping` ([section 3.3.1](#bullet3x3x1)). Next, we can use the first part of the value of the [ln](https://centerblc.github.io/N1904/features/ln.html#start) feature as an index into the `louwNidaMapping` table. This will "translate" the Louw-Nida top-level domain from a numeric value to a more descriptive one. This approach allows us to group all *hapax legomena* by their top-level Louw-Nida domain (the primary domain number before any subdomains). Later in the script, we add links to [BibleOL](https://learner.bible) to facilitate easy verification of each occurrence’s context.

# ### 3.3.1 - Defining the mapping dictionary
#
# The following cell builds a dictionary of high-level Louw-Nida top-level classifications.<sup>5</sup>

# In[8]:

# The following script will produce a dictionary of Louw-Nida top-level domains
# The structure of the dictionary is:
#    louwNidaMapping = {
#         numeric (key) : "description"
#         ...
#    }

import requests
from bs4 import BeautifulSoup

# Debugging mode (False, True)
debug = False

# URL of the Louw-Nida classifications page
url = "https://www.laparola.net/greco/louwnida.php"

# Retrieve the webpage content
response = requests.get(url)
if debug:
    print(f"Retrieving URL {url} returned {response}")
response.raise_for_status()  # Check for request errors

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Initialize an empty dictionary
louwNidaMapping = {}

# Find all
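# The scraping cell above is truncated here and depends on the live page layout of laparola.net, but the way the resulting `louwNidaMapping` dictionary is meant to be used (section 3.3.2) can be sketched without it: take the integer part of each `ln` value (the digits before the dot) and look it up in the mapping. The domain descriptions and `(lemma, ln)` pairs below are a small illustrative subset with assumed values, not the full scraped mapping or real corpus data.

```python
from collections import defaultdict

# Illustrative subset of Louw-Nida top-level domains (assumed descriptions)
louwNidaMapping = {
    33: "Communication",
    93: "Names of Persons and Places",
}

# Hypothetical (lemma, ln) pairs standing in for Fs("ln").v(node) lookups
hapaxLnValues = [
    ("Δανιήλ", "93.238"),
    ("ἱερουργέω", "53.14"),  # domain 53 is absent from our subset
]

# Group lemmata by top-level domain: the part of `ln` before the dot
groupedByDomain = defaultdict(list)
for lemma, ln in hapaxLnValues:
    topLevel = int(ln.split(".")[0])
    label = louwNidaMapping.get(topLevel, f"Unknown domain {topLevel}")
    groupedByDomain[label].append(lemma)

for domain, lemmata in sorted(groupedByDomain.items()):
    print(f"{domain}: {', '.join(lemmata)}")
```

# The `.get()` fallback makes the grouping robust against domains missing from the mapping (or words without an `ln` value), which is safer than a bare dictionary lookup when the mapping is built by scraping.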
# # 4 - Attribution and footnotes
# ##### [Back to TOC](#TOC)
#
# Author | Version | Date
# --- | --- | ---
# Tony Jurg | 1.1 | 17 November 2024
" htmlOutput += condaListOutput htmlOutput += "