To get started, consult the start notebook.
We spot the similarities between ayas in the Quran.
There are 6236 ayas in the Quran. Comparing every aya with every other aya requires roughly 20 million comparisons. That is a costly operation: on this laptop it took a full 30 seconds.
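As a rough check on that figure: the number of unordered pairs among n items is n(n-1)/2. The sketch below assumes 6236 ayas, the total reported by the clustering run further down in this notebook; the variable names are ours.

```python
# Number of unordered pairs among n items: n * (n - 1) / 2.
# Assumption: 6236 ayas, the total reported by the clustering output later on.
n_ayas = 6236
n_pairs = n_ayas * (n_ayas - 1) // 2
print(f"{n_pairs:,} comparisons")
```

This comes to about 19.4 million pairs, in line with the "20 million" quoted above.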
The good news is that we have stored the outcome in an extra feature.
This feature is packaged in a TF data module, which we load below by passing the parameter mod to the use() call.
%load_ext autoreload
%autoreload 2
import collections
from tf.app import use
A = use(
    "q-ran/quran",
    hoist=globals(),
    mod="q-ran/quran/parallels/tf:clone",
)
# A = use('q-ran/quran', hoist=globals(), mod='q-ran/quran/parallels/tf')
This is Text-Fabric 9.2.3
Api reference : https://annotation.github.io/text-fabric/tf/cheatsheet.html
41 features found and 0 ignored
The new feature is sim, and it is an edge feature. It annotates pairs of ayas $(l, m)$ where $l$ and $m$ have similar content. The degree of similarity is a percentage (between 60 and 100), and this value is annotated onto the edges.
Here is an example:
query = """
aya
<sim> aya
"""
results = A.search(query)
0.02s 2468 results
similars = [n for n in F.otype.s("aya") if E.sim.b(n)]
exampleAya = similars[0]
print(
    f"{len(similars)} ayas with a similar aya, the first one has node number {exampleAya}"
)
812 ayas with a similar aya, the first one has node number 128221
sisters = E.sim.b(exampleAya)
print(f"{len(sisters)} similar ayas")
print("\n".join(f"{s[0]} with similarity {s[1]}" for s in sisters[0:10]))
A.table(tuple((s[0],) for s in sisters), end=10)
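For a valued edge feature, Text-Fabric returns each edge as a (node, value) pair, which is why the code above reads s[0] for the aya and s[1] for the similarity. The sketch below mimics that shape with a plain dictionary so it can be run without the corpus. The node numbers, the percentages, and the helper names edges_from and edges_to are invented for illustration and are not part of the Text-Fabric API.

```python
# Toy stand-in for a valued edge feature such as sim.
# Assumption: node numbers and percentages are invented for illustration.
toy_edges = {
    (1, 2): 100,
    (1, 3): 75,
    (2, 3): 60,
}


def edges_from(node):
    """Outgoing edges of node as (target, value) pairs, the shape E.sim.f(node) has."""
    return tuple((t, v) for ((s, t), v) in sorted(toy_edges.items()) if s == node)


def edges_to(node):
    """Incoming edges of node as (source, value) pairs, the shape E.sim.b(node) has."""
    return tuple((s, v) for ((s, t), v) in sorted(toy_edges.items()) if t == node)


print(edges_from(1))
print(edges_to(3))
```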
Let's first find out the range of similarities:
minSim = None
maxSim = None
for ln in F.otype.s("aya"):
    sisters = E.sim.f(ln)
    if not sisters:
        continue
    thisMin = min(s[1] for s in sisters)
    thisMax = max(s[1] for s in sisters)
    if minSim is None or thisMin < minSim:
        minSim = thisMin
    if maxSim is None or thisMax > maxSim:
        maxSim = thisMax
print(f"minimum similarity is {minSim:>3}")
print(f"maximum similarity is {maxSim:>3}")
minimum similarity is  60
maximum similarity is 100
We give a few examples of the least similar lines.
N.B. Lines that are less than 60% similar have not made it into the sim feature!
We can use a search template to get the 60% lines.
query = """
aya
<sim=60> aya
"""
In words: find a line connected via a sim edge with value 60 to another line.
results = A.search(query)
0.01s 290 results
Not very many indeed: 290 of the 2468 results found earlier. It seems that lines are either very similar, or not so similar at all.
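One way to probe that impression is to tally how often each similarity value occurs. The sketch below does this for an invented list of (aya, aya, similarity) triples, so it runs on its own. With the corpus loaded, the triples could instead be gathered from E.sim.f(n) for every aya node n. The name toy_pairs and its contents are assumptions, not corpus data.

```python
import collections

# Assumption: invented (aya, aya, similarity) triples, not corpus data.
toy_pairs = [
    (1, 2, 100),
    (1, 3, 100),
    (2, 3, 100),
    (4, 5, 60),
    (6, 7, 85),
]

# Tally how often each similarity percentage occurs.
value_counts = collections.Counter(sim for (_, _, sim) in toy_pairs)

# Report from the highest similarity down to the lowest.
for (sim, amount) in sorted(value_counts.items(), reverse=True):
    print(f"similarity {sim:>3}: {amount:>3} pairs")
```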
A.table(results, start=1, end=10, withPassage="1 2")
n | aya | aya |
---|---|---|
1 | 2:39 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلنَّارِ هُمْ فِيهَا خَٰلِدُونَ | 5:10 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلْجَحِيمِ |
2 | 2:39 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلنَّارِ هُمْ فِيهَا خَٰلِدُونَ | 5:86 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلْجَحِيمِ |
3 | 2:107 أَلَمْ تَعْلَمْ أَنَّ ٱللَّهَ لَهُۥ مُلْكُ ٱلسَّمَٰوَٰتِ وَٱلْأَرْضِ وَمَا لَكُم مِّن دُونِ ٱللَّهِ مِن وَلِىٍّ وَلَا نَصِيرٍ | 29:22 وَمَآ أَنتُم بِمُعْجِزِينَ فِى ٱلْأَرْضِ وَلَا فِى ٱلسَّمَآءِ وَمَا لَكُم مِّن دُونِ ٱللَّهِ مِن وَلِىٍّ وَلَا نَصِيرٍ |
4 | 3:5 إِنَّ ٱللَّهَ لَا يَخْفَىٰ عَلَيْهِ شَىْءٌ فِى ٱلْأَرْضِ وَلَا فِى ٱلسَّمَآءِ | 14:38 رَبَّنَآ إِنَّكَ تَعْلَمُ مَا نُخْفِى وَمَا نُعْلِنُ وَمَا يَخْفَىٰ عَلَى ٱللَّهِ مِن شَىْءٍ فِى ٱلْأَرْضِ وَلَا فِى ٱلسَّمَآءِ |
5 | 3:74 يَخْتَصُّ بِرَحْمَتِهِۦ مَن يَشَآءُ وَٱللَّهُ ذُو ٱلْفَضْلِ ٱلْعَظِيمِ | 62:4 ذَٰلِكَ فَضْلُ ٱللَّهِ يُؤْتِيهِ مَن يَشَآءُ وَٱللَّهُ ذُو ٱلْفَضْلِ ٱلْعَظِيمِ |
6 | 4:138 بَشِّرِ ٱلْمُنَٰفِقِينَ بِأَنَّ لَهُمْ عَذَابًا أَلِيمًا | 15:50 وَأَنَّ عَذَابِى هُوَ ٱلْعَذَابُ ٱلْأَلِيمُ |
7 | 4:138 بَشِّرِ ٱلْمُنَٰفِقِينَ بِأَنَّ لَهُمْ عَذَابًا أَلِيمًا | 84:24 فَبَشِّرْهُم بِعَذَابٍ أَلِيمٍ |
8 | 5:10 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلْجَحِيمِ | 2:39 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلنَّارِ هُمْ فِيهَا خَٰلِدُونَ |
9 | 5:86 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلْجَحِيمِ | 2:39 وَٱلَّذِينَ كَفَرُوا۟ وَكَذَّبُوا۟ بِـَٔايَٰتِنَآ أُو۟لَٰٓئِكَ أَصْحَٰبُ ٱلنَّارِ هُمْ فِيهَا خَٰلِدُونَ |
10 | 6:30 وَلَوْ تَرَىٰٓ إِذْ وُقِفُوا۟ عَلَىٰ رَبِّهِمْ قَالَ أَلَيْسَ هَٰذَا بِٱلْحَقِّ قَالُوا۟ بَلَىٰ وَرَبِّنَا قَالَ فَذُوقُوا۟ ٱلْعَذَابَ بِمَا كُنتُمْ تَكْفُرُونَ | 46:34 وَيَوْمَ يُعْرَضُ ٱلَّذِينَ كَفَرُوا۟ عَلَى ٱلنَّارِ أَلَيْسَ هَٰذَا بِٱلْحَقِّ قَالُوا۟ بَلَىٰ وَرَبِّنَا قَالَ فَذُوقُوا۟ ٱلْعَذَابَ بِمَا كُنتُمْ تَكْفُرُونَ |
Or in lemma transcription:
A.table(results, start=1, end=10, fmt="lex-trans-full", withPassage="1 2")
n | aya | aya |
---|---|---|
1 | 2:39 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b naAr fiY xa`lid | 5:10 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b jaHiym |
2 | 2:39 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b naAr fiY xa`lid | 5:86 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b jaHiym |
3 | 2:107 lam Ealima >an~ {ll~ah mulok samaA^' >aroD maA min duwn {ll~ah min waliY~ laA naSiyr | 29:22 maA muEojiz fiY >aroD laA fiY samaA^' maA min duwn {ll~ah min waliY~ laA naSiyr |
4 | 3:5 <in~ {ll~ah laA yaxofaY` EalaY` $aYo' fiY >aroD laA fiY samaA^' | 14:38 rab~ <in~ Ealima maA >uxofiYa maA >aEolan maA yaxofaY` EalaY` {ll~ah min $aYo' fiY >aroD laA fiY samaA^' |
5 | 3:74 yaxotaS~u raHomap man $aA^'a {ll~ah *uw faDol EaZiym | 62:4 *a`lik faDol {ll~ah A^taY man $aA^'a {ll~ah *uw faDol EaZiym |
6 | 4:138 bu$~ira muna`fiquwn >an~ Ea*aAb >aliym | 15:50 >an~ Ea*aAb Ea*aAb >aliym |
7 | 4:138 bu$~ira muna`fiquwn >an~ Ea*aAb >aliym | 84:24 bu$~ira Ea*aAb >aliym |
8 | 5:10 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b jaHiym | 2:39 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b naAr fiY xa`lid |
9 | 5:86 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b jaHiym | 2:39 {l~a*iY kafara ka*~aba 'aAyap >uwla`^}ik >aSoHa`b naAr fiY xa`lid |
10 | 6:30 law ra'aA <i* wuqifu EalaY` rab~ qaAla l~ayosa ha`*aA Haq~ qaAla balaY` rab~ qaAla *aAqu Ea*aAb maA kaAna kafara | 46:34 yawom EaraDa {l~a*iY kafara EalaY` naAr l~ayosa ha`*aA Haq~ qaAla balaY` rab~ qaAla *aAqu Ea*aAb maA kaAna kafara |
From now on we forget about the level of similarity, and focus on whether two lines are just "similar", meaning that they have a high degree of similarity.
Before we try to find them, let's see if we can group the lines into clusters of similar lines.
CLUSTER_THRESHOLD = 0.5
def makeClusters():
    A.indent(reset=True)
    chunkSize = 1000
    b = 0
    j = 0
    clusters = []
    for ln in F.otype.s("aya"):
        j += 1
        b += 1
        if b == chunkSize:
            b = 0
            A.info(f"{j:>5} ayas and {len(clusters):>5} clusters")
        lSisters = {x[0] for x in E.sim.b(ln)}
        lAdded = False
        for cl in clusters:
            if len(cl & lSisters) > CLUSTER_THRESHOLD * len(cl):
                cl.add(ln)
                lAdded = True
                break
        if not lAdded:
            clusters.append({ln})
    A.info(f"{j:>5} ayas and {len(clusters):>5} clusters")
    return clusters
clusters = makeClusters()
0.10s  1000 ayas and   985 clusters
0.34s  2000 ayas and  1956 clusters
0.76s  3000 ayas and  2903 clusters
1.31s  4000 ayas and  3777 clusters
2.03s  5000 ayas and  4673 clusters
2.86s  6000 ayas and  5564 clusters
3.09s  6236 ayas and  5777 clusters
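To see the clustering rule in isolation: an aya joins the first existing cluster in which more than a CLUSTER_THRESHOLD fraction of the members are among its similar ayas, and otherwise it starts a new cluster. The sketch below replays that rule on an invented similarity relation, so it runs without the corpus. The names toy_similar and make_toy_clusters are hypothetical.

```python
CLUSTER_THRESHOLD = 0.5

# Assumption: invented symmetric similarity relation between six items.
toy_similar = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2},
    4: {5},
    5: {4},
    6: set(),
}


def make_toy_clusters(similar):
    """Greedy single pass: each item joins the first cluster that fits, else starts one."""
    clusters = []
    for item in sorted(similar):
        sisters = similar[item]
        for cl in clusters:
            # Join when more than CLUSTER_THRESHOLD of the cluster's members are sisters.
            if len(cl & sisters) > CLUSTER_THRESHOLD * len(cl):
                cl.add(item)
                break
        else:
            # No existing cluster fits, so the item starts its own cluster.
            clusters.append({item})
    return clusters


toy_clusters = make_toy_clusters(toy_similar)
print(toy_clusters)
```

Because the pass is greedy, the result depends on the order in which items are visited: an item is compared only with clusters that exist at that moment and is never moved afterwards.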
What is the distribution of the clusters, in terms of how many similar lines they contain? We count them.
clusterSizes = collections.Counter()
for cl in clusters:
    clusterSizes[len(cl)] += 1
for (size, amount) in sorted(
    clusterSizes.items(),
    key=lambda x: (-x[0], x[1]),
):
    print(f"clusters of size {size:>4}: {amount:>5}")
clusters of size   32:     1
clusters of size   12:     1
clusters of size    9:     1
clusters of size    8:     2
clusters of size    7:     2
clusters of size    6:     2
clusters of size    5:     3
clusters of size    4:    14
clusters of size    3:    24
clusters of size    2:   271
clusters of size    1:  5456
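As a sanity check, the size distribution should account for every cluster and every aya. The sketch below copies the counts from the output above into a dictionary (the name cluster_sizes is ours) and adds them up.

```python
# Cluster size -> number of clusters of that size, copied from the output above.
cluster_sizes = {
    32: 1, 12: 1, 9: 1, 8: 2, 7: 2, 6: 2,
    5: 3, 4: 14, 3: 24, 2: 271, 1: 5456,
}

# Total number of clusters, and total number of ayas they contain.
n_clusters = sum(cluster_sizes.values())
n_ayas = sum(size * amount for (size, amount) in cluster_sizes.items())
print(f"{n_clusters} clusters covering {n_ayas} ayas")
```

Both totals, 5777 clusters and 6236 ayas, agree with the last progress line of the clustering run.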
Let's investigate some interesting groups that lie in some sweet spots.
There are a few ayas that occur in a moderately sized cluster. Let's print those clusters:
mediumClusters = []
for cluster in clusters:
    nAyas = len(cluster)
    if 4 < nAyas < 10:
        mediumClusters.append(cluster)
for cluster in sorted(
    mediumClusters,
    key=lambda x: -len(x),
):
    print(f"Cluster with {len(cluster)} ayas")
    A.table([(a,) for a in sorted(cluster)])
Cluster with 9 ayas
n | p | aya |
---|---|---|
1 | 26:108 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
2 | 26:110 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
3 | 26:126 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
4 | 26:131 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
5 | 26:144 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
6 | 26:150 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
7 | 26:163 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
8 | 26:179 | فَٱتَّقُوا۟ ٱللَّهَ وَأَطِيعُونِ |
9 | 71:3 | أَنِ ٱعْبُدُوا۟ ٱللَّهَ وَٱتَّقُوهُ وَأَطِيعُونِ |
Cluster with 8 ayas
n | p | aya |
---|---|---|
1 | 26:8 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
2 | 26:67 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
3 | 26:103 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
4 | 26:121 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
5 | 26:139 | فَكَذَّبُوهُ فَأَهْلَكْنَٰهُمْ إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
6 | 26:158 | فَأَخَذَهُمُ ٱلْعَذَابُ إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
7 | 26:174 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
8 | 26:190 | إِنَّ فِى ذَٰلِكَ لَءَايَةً وَمَا كَانَ أَكْثَرُهُم مُّؤْمِنِينَ |
Cluster with 8 ayas
n | p | aya |
---|---|---|
1 | 26:9 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
2 | 26:68 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
3 | 26:104 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
4 | 26:122 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
5 | 26:140 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
6 | 26:159 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
7 | 26:175 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
8 | 26:191 | وَإِنَّ رَبَّكَ لَهُوَ ٱلْعَزِيزُ ٱلرَّحِيمُ |
Cluster with 7 ayas
n | p | aya |
---|---|---|
1 | 10:48 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
2 | 21:38 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
3 | 27:71 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
4 | 32:28 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْفَتْحُ إِن كُنتُمْ صَٰدِقِينَ |
5 | 34:29 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
6 | 36:48 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
7 | 67:25 | وَيَقُولُونَ مَتَىٰ هَٰذَا ٱلْوَعْدُ إِن كُنتُمْ صَٰدِقِينَ |
Cluster with 7 ayas
n | p | aya |
---|---|---|
1 | 15:40 | إِلَّا عِبَادَكَ مِنْهُمُ ٱلْمُخْلَصِينَ |
2 | 37:40 | إِلَّا عِبَادَ ٱللَّهِ ٱلْمُخْلَصِينَ |
3 | 37:74 | إِلَّا عِبَادَ ٱللَّهِ ٱلْمُخْلَصِينَ |
4 | 37:128 | إِلَّا عِبَادَ ٱللَّهِ ٱلْمُخْلَصِينَ |
5 | 37:160 | إِلَّا عِبَادَ ٱللَّهِ ٱلْمُخْلَصِينَ |
6 | 37:169 | لَكُنَّا عِبَادَ ٱللَّهِ ٱلْمُخْلَصِينَ |
7 | 38:83 | إِلَّا عِبَادَكَ مِنْهُمُ ٱلْمُخْلَصِينَ |
Cluster with 6 ayas
Cluster with 6 ayas
n | p | aya |
---|---|---|
1 | 12:104 | وَمَا تَسْـَٔلُهُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ هُوَ إِلَّا ذِكْرٌ لِّلْعَٰلَمِينَ |
2 | 26:109 | وَمَآ أَسْـَٔلُكُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ أَجْرِىَ إِلَّا عَلَىٰ رَبِّ ٱلْعَٰلَمِينَ |
3 | 26:127 | وَمَآ أَسْـَٔلُكُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ أَجْرِىَ إِلَّا عَلَىٰ رَبِّ ٱلْعَٰلَمِينَ |
4 | 26:145 | وَمَآ أَسْـَٔلُكُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ أَجْرِىَ إِلَّا عَلَىٰ رَبِّ ٱلْعَٰلَمِينَ |
5 | 26:164 | وَمَآ أَسْـَٔلُكُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ أَجْرِىَ إِلَّا عَلَىٰ رَبِّ ٱلْعَٰلَمِينَ |
6 | 26:180 | وَمَآ أَسْـَٔلُكُمْ عَلَيْهِ مِنْ أَجْرٍ إِنْ أَجْرِىَ إِلَّا عَلَىٰ رَبِّ ٱلْعَٰلَمِينَ |
Cluster with 5 ayas
Cluster with 5 ayas
Cluster with 5 ayas
n | p | aya |
---|---|---|
1 | 57:1 | سَبَّحَ لِلَّهِ مَا فِى ٱلسَّمَٰوَٰتِ وَٱلْأَرْضِ وَهُوَ ٱلْعَزِيزُ ٱلْحَكِيمُ |
2 | 59:1 | سَبَّحَ لِلَّهِ مَا فِى ٱلسَّمَٰوَٰتِ وَمَا فِى ٱلْأَرْضِ وَهُوَ ٱلْعَزِيزُ ٱلْحَكِيمُ |
3 | 59:24 | هُوَ ٱللَّهُ ٱلْخَٰلِقُ ٱلْبَارِئُ ٱلْمُصَوِّرُ لَهُ ٱلْأَسْمَآءُ ٱلْحُسْنَىٰ يُسَبِّحُ لَهُۥ مَا فِى ٱلسَّمَٰوَٰتِ وَٱلْأَرْضِ وَهُوَ ٱلْعَزِيزُ ٱلْحَكِيمُ |
4 | 61:1 | سَبَّحَ لِلَّهِ مَا فِى ٱلسَّمَٰوَٰتِ وَمَا فِى ٱلْأَرْضِ وَهُوَ ٱلْعَزِيزُ ٱلْحَكِيمُ |
5 | 62:1 | يُسَبِّحُ لِلَّهِ مَا فِى ٱلسَّمَٰوَٰتِ وَمَا فِى ٱلْأَرْضِ ٱلْمَلِكِ ٱلْقُدُّوسِ ٱلْعَزِيزِ ٱلْحَكِيمِ |
CC-BY Dirk Roorda