Bas Meeuse has written an excellent primer for exegetes. In a series of increasingly difficult examples he shows how one can use template-based search to explore syntactic patterns in the text.
The ETCBC database makes this possible, especially in its incarnation as the BHSA data.
In that context, there are two ways to perform template-based search: MQL queries, as run by SHEBANQ, and TF-Query, the search facility of Text-Fabric.
Both MQL and TF-Query have been developed in close connection with the ETCBC, MQL by Ulrik Sandborg-Petersen, as part of his database system Emdros, and Text-Fabric by Dirk Roorda.
MQL is older, dating from 2005, and it is implemented in the programming language C. Text-Fabric dates from 2016 and is implemented in Python 3. This time difference shows not only in the syntax of MQL and TF-Query, but also in other aspects of usability.
MQL and TF-Query cannot be ranked by expressive power: neither is simply stronger than the other. Here are a few points of comparison.
point of comparison | MQL | TF-Query |
---|---|---|
Syntax | bracket based, quotes sometimes needed | indentation based, no quotes needed |
Kleene-star | yes | no |
Quantifiers | NOTEXIST | /with/ /or/ /or/ and /where/ /have/ and /without/ |
Relational operators | adjacent (implicitly), .., UNORDEREDGROUP | explicit choice from many spatial relationships |
Alternatives | OR | no |
Usability | needs database back-end, is external C-program | no database needed, pure Python |
Integration | results need to be parsed | results directly usable in Python programs, result sets of queries can be building blocks of new queries |
Display | SHEBANQ does not display MQL results as such; instead it displays all words that occur in the results | In Jupyter notebooks and in the TF-browser, results can be shown at various levels of detail, with various ways of highlighting |
Performance | not measured | measured, but not compared |
Here is how we weigh the differences:
Research questions usually go further than what can be expressed in queries, no matter how sophisticated the query language. Writing a query is just a first step, and usually more than one query is needed to gather the evidence that can confirm or reject a hypothesis.
Queries tend to be difficult and error-prone. Therefore we need checking and cross-checking. When we check a query by means of another query, we have to make sure that the other query is right as well. We need good visual access to query results, and we need the tools to combine and filter them.
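As a minimal sketch of what combining and cross-checking query results can look like in Python, consider two result lists that are supposed to find the same clauses. The node numbers below are invented for illustration; in practice the tuples would come from two separate searches.

```python
# Hypothetical result tuples from two queries: (clause node, word node).
# The numbers are made up; real results would come from the search engine.
results_a = [(427559, 1001), (427600, 1050), (427712, 1120)]
results_b = [(427559, 1001), (427712, 1120), (427800, 1300)]

# Compare the queries on the clause node, the first member of each tuple.
clauses_a = {result[0] for result in results_a}
clauses_b = {result[0] for result in results_b}

in_both = clauses_a & clauses_b  # clauses found by both queries
only_a = clauses_a - clauses_b   # clauses found by query A only
only_b = clauses_b - clauses_a   # clauses found by query B only

print(f"{len(in_both)} shared, {len(only_a)} only in A, {len(only_b)} only in B")
```

The clauses that show up in only one of the two sets are the ones worth inspecting by eye.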
In my opinion, TF with TF-Query supports this complex practice significantly better than SHEBANQ with MQL. SHEBANQ is often a good start to get an idea, and it can also be used to advertise aspects of a research question after the work has been done, but it does not support all the intermediate steps. The big bonus of SHEBANQ is that it does not require local installation nor programming skills.
But when it comes to the nitty-gritty in the middle of research, there comes a moment to pay the price: if the computer is your tool, then you need to acquire the number-one skill to control information: programming.
The purpose of this primer is to help people to transition from MQL to TF-Query. To that end we reproduce the MQL queries in the SHEBANQ tutorial by Bas Meeuse by means of Text-Fabric queries, and discuss the subtle and not so subtle distinctions that we will encounter.
Dirk Roorda (dirk.roorda@dans.knaw.nl)
Copyright: No restrictions. Attribution appreciated. CC-BY.
We load the Text-Fabric program and the BHSA data.
%load_ext autoreload
%autoreload 2
from tf.app import use
from util import getTfVerses, getShebanqData, compareResults, MQL_RESULTS
VERSION = "2017"
# A = use('ETCBC/bhsa', hoist=globals(), version=VERSION)
A = use("ETCBC/bhsa:clone", checkout="clone", hoist=globals(), version=VERSION)
We will compare the results of MQL queries, as run by SHEBANQ, with the results obtained from TF-Queries.
To that end we have downloaded the CSV results of all example queries in the tutorial by Bas. We have written a convenience function to get the statistics of a SHEBANQ query: the number of results, verses, and words.
The number of results cannot be determined from the exported SHEBANQ data. I have taken it from the SHEBANQ interface and stored it. The numbers of verses and words are read from the files exported from SHEBANQ.
Queries on SHEBANQ may have changed in the meantime, and this tutorial may not reflect that. It is based on what was in SHEBANQ around 2021-03-21.
Bas Meeuse: Example 1: Moses starts the speech
[clause FOCUS
  [word lex = '>MR[' OR lex = 'DBR[']
  [phrase function = Subj
    [word lex = 'MCH=/']
  ]
  ..
  [phrase function IN (Cmpl)
    [word lex = 'JHWH/' OR lex = '>LHJM/']
  ]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 1)
8 results in 8 verses with 42 words
query = """
clause
  word lex=>MR[|DBR[
  <: phrase function=Subj
    word lex=MCH=/
  < phrase function=Cmpl
    word lex=JHWH/|>LHJM/
"""
results = A.search(query)
1.61s 8 results
We need to make some effort to count the distinct verses and focused words in the results. The convenience function getTfVerses() takes care of this.
The words that SHEBANQ displays are the words that belong to the objects marked by the keyword FOCUS.
In TF-Query, the results are tuples where each member corresponds to an object in the query, in that order.
So when we count the words from TF-Query results, we must specify which members of the tuple have FOCUS.
We do that by giving the index numbers of the focused objects (index numbers start at 0).
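To make the role of the index numbers concrete, here is a minimal sketch of how focused words could be collected from result tuples. It is not the implementation of getTfVerses(), which lives in the util module and is not shown here. The function focused_words, the lookup words_of and all node numbers are hypothetical; with Text-Fabric loaded, the lookup would be something like L.d(node, otype="word").

```python
def focused_words(results, focus_indices, words_of):
    """Collect the distinct words inside the focused members of each result tuple."""
    words = set()
    for result in results:
        for i in focus_indices:
            words.update(words_of(result[i]))
    return words


# Toy data (invented): which word nodes are contained in which object node.
CONTAINED = {
    500: [1, 2, 3],  # a clause of three words
    600: [2],        # a phrase inside that clause
    501: [4, 5, 6],  # another clause
    601: [5],        # a phrase inside that clause
}


def words_of(node):
    # A node without an entry is treated as a word, which contains just itself.
    return CONTAINED.get(node, [node])


# Two result tuples of the shape (clause, phrase, word).
toy_results = [(500, 600, 2), (501, 601, 5)]

print(len(focused_words(toy_results, (0,), words_of)))  # focus on the clause
print(len(focused_words(toy_results, (1,), words_of)))  # focus on the phrase
```

Focusing on index 0 counts every word of the clauses, while focusing on index 1 counts only the words of the phrases, which is why the choice of indices has to mirror the FOCUS keywords in the MQL query.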
(tfVerses, tfWords) = getTfVerses(A, results, (0,))
8 verses 42 words
The numbers agree exactly.
This is not a rigorous proof that the MQL results are identical to the TF results, but it is a very good indication.
We therefore also check that the words and verses in the SHEBANQ results are exactly the words and verses in the TF results, by looking for the first difference.
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
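The idea of looking for a first difference can be sketched in a few lines. This is an illustration only, not the code of compareResults() from the util module; the function name first_difference and the sample numbers are hypothetical.

```python
def first_difference(left, right):
    """Return the smallest item that occurs in exactly one collection, or None."""
    difference = set(left) ^ set(right)  # symmetric difference
    return min(difference) if difference else None


# Invented word numbers standing in for two sets of focused words.
shebanq_words = [10, 11, 12, 40, 41]
tf_words = [10, 11, 12, 40, 41]

diff = first_difference(shebanq_words, tf_words)
print("WORDS EQUAL" if diff is None else f"first difference at {diff}")
```

Reporting only the first difference keeps the output short when two large result sets diverge in many places.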
Finally, here are the results of the TF query.
A.table(results)
n | p | clause | word | phrase | word | phrase | word |
---|---|---|---|---|---|---|---|
1 | Exodus 3:11 | וַיֹּ֤אמֶר מֹשֶׁה֙ אֶל־הָ֣אֱלֹהִ֔ים | יֹּ֤אמֶר | מֹשֶׁה֙ | מֹשֶׁה֙ | אֶל־הָ֣אֱלֹהִ֔ים | אֱלֹהִ֔ים |
2 | Exodus 3:13 | וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־הָֽאֱלֹהִ֗ים | יֹּ֨אמֶר | מֹשֶׁ֜ה | מֹשֶׁ֜ה | אֶל־הָֽאֱלֹהִ֗ים | אֱלֹהִ֗ים |
3 | Exodus 4:10 | וַיֹּ֨אמֶר מֹשֶׁ֣ה אֶל־יְהוָה֮ | יֹּ֨אמֶר | מֹשֶׁ֣ה | מֹשֶׁ֣ה | אֶל־יְהוָה֮ | יְהוָה֮ |
4 | Exodus 19:23 | וַיֹּ֤אמֶר מֹשֶׁה֙ אֶל־יְהוָ֔ה | יֹּ֤אמֶר | מֹשֶׁה֙ | מֹשֶׁה֙ | אֶל־יְהוָ֔ה | יְהוָ֔ה |
5 | Exodus 33:12 | וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־יְהוָ֗ה | יֹּ֨אמֶר | מֹשֶׁ֜ה | מֹשֶׁ֜ה | אֶל־יְהוָ֗ה | יְהוָ֗ה |
6 | Numbers 11:11 | וַיֹּ֨אמֶר מֹשֶׁ֜ה אֶל־יְהוָ֗ה | יֹּ֨אמֶר | מֹשֶׁ֜ה | מֹשֶׁ֜ה | אֶל־יְהוָ֗ה | יְהוָ֗ה |
7 | Numbers 14:13 | וַיֹּ֥אמֶר מֹשֶׁ֖ה אֶל־יְהוָ֑ה | יֹּ֥אמֶר | מֹשֶׁ֖ה | מֹשֶׁ֖ה | אֶל־יְהוָ֑ה | יְהוָ֑ה |
8 | Numbers 27:15 | וַיְדַבֵּ֣ר מֹשֶׁ֔ה אֶל־יְהוָ֖ה | יְדַבֵּ֣ר | מֹשֶׁ֔ה | מֹשֶׁ֔ה | אֶל־יְהוָ֖ה | יְהוָ֖ה |
Bas Meeuse: Example 3: Whose people?
[phrase_atom FOCUS
  [word AS P sp = prps]
  ..
  [word lex = "W" OR lex = ">W"]
  ..
  [word prs !~ "a" AND prs_ps = P.ps]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 3)
308 results in 150 verses with 685 words
query = """
phrase_atom
  p:word sp=prps
  < word lex=W|>W
  < w:word prs~^[^a]*$
p .ps=prs_ps. w
"""
Note that TF-Query has no !~ comparison, i.e. "value does not match regex".
It does have the ~ comparison, "value matches regex", so we have to find a regex that says: "value does not contain a".
That is the same as: every character in value is something other than a.
The regex ^[^a]*$ says exactly that: from the start of the value (^) to its end ($) there are zero or more (*) characters other than a ([^a]).
See the Python re docs.
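The equivalence between "does not contain a" and the pattern ^[^a]*$ can be checked directly with Python's re module. The helper lacks_a and the sample strings are for illustration only, and the sketch assumes nothing about how TF-Query applies the pattern internally.

```python
import re

# Matches only strings that consist entirely of characters other than "a".
NO_A = re.compile(r"^[^a]*$")


def lacks_a(value):
    """True if the pattern matches, i.e. the value contains no "a"."""
    return NO_A.search(value) is not None


for value in ["W", "K", "absent", "n/a", ""]:
    # Cross-check against the direct formulation: "a" does not occur in value.
    assert lacks_a(value) == ("a" not in value)
    print(f"{value!r}: {lacks_a(value)}")
```

Note that the empty string also matches, because zero characters other than a still satisfy the pattern.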
results = A.search(query)
1.37s 308 results
(tfVerses, tfWords) = getTfVerses(A, results, (0,))
150 verses 685 words
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
A.table(results, end=5)
n | p | phrase_atom | word | word | word |
---|---|---|---|---|---|
1 | Genesis 6:18 | אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ | אַתָּ֕ה | וּ | בָנֶ֛יךָ |
2 | Genesis 6:18 | אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ | אַתָּ֕ה | וּ | אִשְׁתְּךָ֥ |
3 | Genesis 6:18 | אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ | אַתָּ֕ה | וּ | בָנֶ֖יךָ |
4 | Genesis 6:18 | אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ | אַתָּ֕ה | וְ | אִשְׁתְּךָ֥ |
5 | Genesis 6:18 | אַתָּ֕ה וּבָנֶ֛יךָ וְאִשְׁתְּךָ֥ וּנְשֵֽׁי־בָנֶ֖יךָ | אַתָּ֕ה | וְ | בָנֶ֖יךָ |
Willem van Peursen: Judges 5.1 (Sample query)
[clause
  [phrase FOCUS function=Pred
    [word sp=verb AND nu=sg AND gn=f]
  ]
  ..
  [phrase FOCUS function=Subj
    [word sp=conj]
  ]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 4)
65 results in 51 verses with 315 words
query = """
clause
  phrase function=Pred
    word sp=verb nu=sg gn=f
  < phrase function=Subj
    word sp=conj
"""
results = A.search(query)
1.36s 65 results
Note that the focus in the query is on both of the phrase objects. In the TF-Query results, the members of the result tuples that correspond to these phrases are at index 1 and 3 (counting from 0).
Let's show that. Here is the first result tuple:
results[0]
(429934, 658758, 12290, 658759, 12292)
These are nodes. Nodes are just numbers through which all information about objects can be retrieved, like bar codes.
Here are the types of these nodes:
for (i, node) in enumerate(results[0]):
    print(f"{i}: {F.otype.v(node)}")
0: clause
1: phrase
2: word
3: phrase
4: word
So the focused objects are 1 and 3, and this is what we pass to getTfVerses():
(tfVerses, tfWords) = getTfVerses(A, results, (1, 3))
51 verses 315 words
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
A.table(results, end=5)
n | p | clause | phrase | word | phrase | word |
---|---|---|---|---|---|---|
1 | Genesis 24:61 | וַתָּ֨קָם רִבְקָ֜ה וְנַעֲרֹתֶ֗יהָ | תָּ֨קָם | תָּ֨קָם | רִבְקָ֜ה וְנַעֲרֹתֶ֗יהָ | וְ |
2 | Genesis 31:14 | וַתַּ֤עַן רָחֵל֙ וְלֵאָ֔ה | תַּ֤עַן | תַּ֤עַן | רָחֵל֙ וְלֵאָ֔ה | וְ |
3 | Genesis 33:7 | וַתִּגַּ֧שׁ גַּם־לֵאָ֛ה וִילָדֶ֖יהָ | תִּגַּ֧שׁ | תִּגַּ֧שׁ | גַּם־לֵאָ֛ה וִילָדֶ֖יהָ | וִ |
4 | Genesis 47:13 | וַתֵּ֜לַהּ אֶ֤רֶץ מִצְרַ֨יִם֙ וְאֶ֣רֶץ כְּנַ֔עַן מִפְּנֵ֖י הָרָעָֽב׃ | תֵּ֜לַהּ | תֵּ֜לַהּ | אֶ֤רֶץ מִצְרַ֨יִם֙ וְאֶ֣רֶץ כְּנַ֔עַן | וְ |
5 | Exodus 15:16 | תִּפֹּ֨ל עֲלֵיהֶ֤ם אֵימָ֨תָה֙ וָפַ֔חַד | תִּפֹּ֨ל | תִּפֹּ֨ל | אֵימָ֨תָה֙ וָפַ֔חַד | וָ |
Oliver Glanz: DHQ article: taking a woman
[clause
  [phrase FOCUS function IN (Pred,PreC)
    [word lex = "LQX["]
  ]
  ..
  [phrase function = Objc
    [word FOCUS gn = f AND nu=sg AND sp=nmpr]
  ]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 5)
23 results in 21 verses with 44 words
query = """
clause
  phrase function=Pred|PreC
    word lex=LQX[
  < phrase function=Objc
    word gn=f nu=sg sp=nmpr
"""
results = A.search(query)
1.32s 23 results
(tfVerses, tfWords) = getTfVerses(A, results, (2, 4))
21 verses 44 words
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
A.table(results, end=5)
n | p | clause | phrase | word | phrase | word |
---|---|---|---|---|---|---|
1 | Genesis 11:31 | וַיִּקַּ֨ח תֶּ֜רַח אֶת־אַבְרָ֣ם בְּנֹ֗ו וְאֶת־לֹ֤וט בֶּן־הָרָן֙ בֶּן־בְּנֹ֔ו וְאֵת֙ שָׂרַ֣י כַּלָּתֹ֔ו אֵ֖שֶׁת אַבְרָ֣ם בְּנֹ֑ו | יִּקַּ֨ח | יִּקַּ֨ח | אֶת־אַבְרָ֣ם בְּנֹ֗ו וְאֶת־לֹ֤וט בֶּן־הָרָן֙ בֶּן־בְּנֹ֔ו וְאֵת֙ שָׂרַ֣י כַּלָּתֹ֔ו אֵ֖שֶׁת אַבְרָ֣ם בְּנֹ֑ו | שָׂרַ֣י |
2 | Genesis 12:5 | וַיִּקַּ֣ח אַבְרָם֩ אֶת־שָׂרַ֨י אִשְׁתֹּ֜ו וְאֶת־לֹ֣וט בֶּן־אָחִ֗יו וְאֶת־כָּל־רְכוּשָׁם֙ וְאֶת־הַנֶּ֖פֶשׁ | יִּקַּ֣ח | יִּקַּ֣ח | אֶת־שָׂרַ֨י אִשְׁתֹּ֜ו וְאֶת־לֹ֣וט בֶּן־אָחִ֗יו וְאֶת־כָּל־רְכוּשָׁם֙ וְאֶת־הַנֶּ֖פֶשׁ | שָׂרַ֨י |
3 | Genesis 16:3 | וַתִּקַּ֞ח שָׂרַ֣י אֵֽשֶׁת־אַבְרָ֗ם אֶת־הָגָ֤ר הַמִּצְרִית֙ שִׁפְחָתָ֔הּ מִקֵּץ֙ עֶ֣שֶׂר שָׁנִ֔ים | תִּקַּ֞ח | תִּקַּ֞ח | אֶת־הָגָ֤ר הַמִּצְרִית֙ שִׁפְחָתָ֔הּ | הָגָ֤ר |
4 | Genesis 20:2 | וַיִּקַּ֖ח אֶת־שָׂרָֽה׃ | יִּקַּ֖ח | יִּקַּ֖ח | אֶת־שָׂרָֽה׃ | שָׂרָֽה׃ |
5 | Genesis 24:61 | וַיִּקַּ֥ח הָעֶ֛בֶד אֶת־רִבְקָ֖ה | יִּקַּ֥ח | יִּקַּ֥ח | אֶת־רִבְקָ֖ה | רִבְקָ֖ה |
Bas Meeuse: Example 6: Turning point in Deut. 29,3?
[clause focus
  [phrase function = Nega]
  [phrase typ = VP
    [word ps = p3]
  ]
  [phrase function = Subj]
  ..
  [phrase function = Time
    [word first lex = "<D"]
  ]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 6)
10 results in 10 verses with 121 words
query = """
clause
  phrase function=Nega
  <: phrase typ=VP
    word ps=p3
  <: phrase function=Subj
  < phrase function=Time
    =: word lex=<D
"""
results = A.search(query)
1.78s 10 results
(tfVerses, tfWords) = getTfVerses(A, results, (0,))
10 verses 121 words
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
A.table(results, end=5)
n | p | clause | phrase | phrase | word | phrase | phrase | word |
---|---|---|---|---|---|---|---|---|
1 | Genesis 32:33 | עַל־כֵּ֡ן לֹֽא־יֹאכְל֨וּ בְנֵֽי־יִשְׂרָאֵ֜ל אֶת־גִּ֣יד הַנָּשֶׁ֗ה עַ֖ד הַיֹּ֣ום הַזֶּ֑ה | לֹֽא־ | יֹאכְל֨וּ | יֹאכְל֨וּ | בְנֵֽי־יִשְׂרָאֵ֜ל | עַ֖ד הַיֹּ֣ום הַזֶּ֑ה | עַ֖ד |
2 | Exodus 10:6 | אֲשֶׁ֨ר לֹֽא־רָא֤וּ אֲבֹתֶ֨יךָ֙ וַאֲבֹו֣ת אֲבֹתֶ֔יךָ מִיֹּ֗ום עַ֖ד הַיֹּ֣ום הַזֶּ֑ה | לֹֽא־ | רָא֤וּ | רָא֤וּ | אֲבֹתֶ֨יךָ֙ וַאֲבֹו֣ת אֲבֹתֶ֔יךָ | עַ֖ד הַיֹּ֣ום הַזֶּ֑ה | עַ֖ד |
3 | Leviticus 19:13 | לֹֽא־תָלִ֞ין פְּעֻלַּ֥ת שָׂכִ֛יר אִתְּךָ֖ עַד־בֹּֽקֶר׃ | לֹֽא־ | תָלִ֞ין | תָלִ֞ין | פְּעֻלַּ֥ת שָׂכִ֛יר | עַד־בֹּֽקֶר׃ | עַד־ |
4 | Deuteronomy 29:3 | וְלֹֽא־נָתַן֩ יְהוָ֨ה לָכֶ֥ם לֵב֙ וְעֵינַ֥יִם וְאָזְנַ֣יִם עַ֖ד הַיֹּ֥ום הַזֶּֽה׃ | לֹֽא־ | נָתַן֩ | נָתַן֩ | יְהוָ֨ה | עַ֖ד הַיֹּ֥ום הַזֶּֽה׃ | עַ֖ד |
5 | Deuteronomy 34:6 | וְלֹֽא־יָדַ֥ע אִישׁ֙ אֶת־קְבֻ֣רָתֹ֔ו עַ֖ד הַיֹּ֥ום הַזֶּֽה׃ | לֹֽא־ | יָדַ֥ע | יָדַ֥ע | אִישׁ֙ | עַ֖ד הַיֹּ֥ום הַזֶּֽה׃ | עַ֖ד |
Reinoud Oosting: to withhold + from
[clause
  [phrase function = Pred OR function = PreC
    [word FOCUS sp = verb AND vs = qal AND lex = "MN<[" ]
  ]
  ..
  [phrase function = Cmpl
    [word FOCUS sp = prep ]
  ]
]
OR
[clause
  [phrase function = Cmpl
    [word FOCUS sp = prep ]
  ]
  ..
  [phrase function = Pred OR function = PreC
    [word FOCUS sp = verb AND vs = qal AND lex = "MN<["]
  ]
]
(verses, words) = getShebanqData(A, MQL_RESULTS, 8)
16 results in 16 verses with 32 words
Note that, as in example 7, the OR is used to specify different orders of the same objects.
In TF-Query the order of sibling objects is not constrained unless we say so, so we do not need the OR, which leads to a much simpler query.
query = """
clause
  phrase function=Pred|PreC
    word sp=verb vs=qal lex=MN<[
  phrase function=Cmpl
    word sp=prep
"""
results = A.search(query)
1.28s 16 results
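Why an order-free template can replace the MQL OR of two orderings can be illustrated with a toy model. In this sketch a clause is reduced to the list of its phrase functions, in text order. The functions ordered and unordered and the sample clauses are hypothetical, and the model ignores the words inside the phrases.

```python
def ordered(clause, first, second):
    """True if some `first` phrase precedes some `second` phrase in the clause."""
    return any(
        a == first and b == second
        for i, a in enumerate(clause)
        for b in clause[i + 1:]
    )


def unordered(clause, x, y):
    """True if the clause contains both phrase functions, in any order."""
    return x in clause and y in clause


# Invented clauses, each given as its sequence of phrase functions.
clauses = [
    ["Pred", "Cmpl"],
    ["Cmpl", "Subj", "Pred"],
    ["Pred", "Subj"],
    ["Conj", "Cmpl"],
]

# MQL style: one pattern per order, joined by OR.
either_order = [
    c for c in clauses
    if ordered(c, "Pred", "Cmpl") or ordered(c, "Cmpl", "Pred")
]

# TF-Query style: no order constraint between the two phrases.
any_order = [c for c in clauses if unordered(c, "Pred", "Cmpl")]

print(either_order == any_order)
```

For two different phrase functions the two formulations select the same clauses, which is the reason the TF query needs only one pattern.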
(tfVerses, tfWords) = getTfVerses(A, results, (2, 4))
16 verses 32 words
compareResults(A, verses, words, tfVerses, tfWords)
VERSES EQUAL WORDS EQUAL
A.table(results, end=5)
n | p | clause | phrase | word | phrase | word |
---|---|---|---|---|---|---|
1 | Genesis 30:2 | אֲשֶׁר־מָנַ֥ע מִמֵּ֖ךְ פְּרִי־בָֽטֶן׃ | מָנַ֥ע | מָנַ֥ע | מִמֵּ֖ךְ | מִמֵּ֖ךְ |
2 | 1_Kings 20:7 | וְלֹ֥א מָנַ֖עְתִּי מִמֶּֽנּוּ׃ | מָנַ֖עְתִּי | מָנַ֖עְתִּי | מִמֶּֽנּוּ׃ | מִמֶּֽנּוּ׃ |
3 | Jeremiah 2:25 | מִנְעִ֤י רַגְלֵךְ֙ מִיָּחֵ֔ף | מִנְעִ֤י | מִנְעִ֤י | מִיָּחֵ֔ף | מִ |
4 | Jeremiah 5:25 | וְחַטֹּ֣אותֵיכֶ֔ם מָנְע֥וּ הַטֹּ֖וב מִכֶּֽם׃ | מָנְע֥וּ | מָנְע֥וּ | מִכֶּֽם׃ | מִכֶּֽם׃ |
5 | Jeremiah 31:16 | מִנְעִ֤י קֹולֵךְ֙ מִבֶּ֔כִי | מִנְעִ֤י | מִנְעִ֤י | מִבֶּ֔כִי | מִ |
This is a difficult case. We do it in a separate notebook: example 10a.
This is a case where we need two TF-queries. We do it in a separate notebook: example 10b.