from tf.app import collect
corpora = dict(
    # Descartes=("github", "CLARIAH", "descartes-tf"),
    # FerdinandHuyck=("github", "CLARIAH", "wp6-ferdinandhuyck"),
    # Missieven=("github", "CLARIAH", "wp6-missieven"),
    # Daghregisters=("github", "CLARIAH", "wp6-daghregisters"),
    # BHSA=("github", "ETCBC", "bhsa"),
    # DSS=("github", "ETCBC", "dss"),
    # Dhammapada=("github", "ETCBC", "dhammapada"),
    # N1904=("github", "ETCBC", "nestle1904"),
    # Peshitta=("github", "ETCBC", "peshitta"),
    # SyrNT=("github", "ETCBC", "syrnt"),
    # NinMed=("github", "Nino-cunei", "ninmed"),
    # OldBabylonian=("github", "Nino-cunei", "oldbabylonian"),
    # OldAssyrian=("github", "Nino-cunei", "oldassyrian"),
    # Uruk=("github", "Nino-cunei", "uruk"),
    Athenaeus=("github", "pthu", "athenaeus"),
    # Quran=("github", "q-ran", "quran"),
    Fusus=("github", "among", "fusus"),
)
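# Not processed by the loop below; kept in the same (backend, org, repo) shape
# so that these entries can be moved into `corpora` without breaking the unpacking.
otherCorpora = dict(
    LXX=("github", "CenterBLC", "LXX"),
    NA=("github", "CenterBLC", "NA"),
    SBLGNT=("github", "CenterBLC", "SBLGNT"),
    Tischendorf=("github", "codykingham", "tischendorf_tf"),
    SamaritanPentateuch=("github", "DT-UCPH", "sp"),
    Ugaritic=("github", "DT-UCPH", "cuc"),
)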
for (corpus, (backend, org, repo)) in corpora.items():
    print(f"=== {corpus} ===>")
    collect(backend, org, repo)
=== Descartes ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
volume | 8 | 85241.88 | 100 |
letter | 725 | 940.60 | 100 |
page | 2884 | 236.45 | 100 |
postscriptum | 56 | 46.79 | 0 |
opener | 545 | 1.97 | 0 |
closer | 541 | 13.10 | 1 |
address | 86 | 15.22 | 0 |
head | 725 | 23.37 | 2 |
p | 8438 | 80.82 | 100 |
sentence | 13074 | 50.14 | 96 |
hi | 5972 | 4.63 | 4 |
formula | 6200 | 1.21 | 1 |
figure | 319 | 1.00 | 0 |
word | 681935 | 1.00 | 100 |
3
CLARIAH/descartes-tf
/Users/me/github/CLARIAH/descartes-tf/app
.bold {
font-weight: bold;
}
.italic {
font-style: italic;
}
.margin {
position: relative;
top: -0.3em;
font-weight: bold;
color: #0000ee;
}
.sub {
vertical-align: sub;
font-size: small;
}
.sup {
vertical-align: super;
font-size: small;
}
<code>letter 1:1001</code>
layoutOrig
about
https://github.com/{org}/{repo}/blob/main/docs/transcription{docExt}
''
True
True
0
True
True
clone
/Users/me/github/CLARIAH/descartes-tf/_temp
Descartes = Descartes, all letters
source/illustrations
Similar Sentences
CLARIAH
parallels/tf
descartes-tf
CLARIAH
/tf
descartes-tf
1.1
http://emlo-portal.bodleian.ox.ac.uk/collections/?catalogue=rene-descartes
See how this corpus is included in the Bodleian catalog
url
True
tex
{notation}
senderloc recipientloc
{id} {date} from {sender} to {recipient}
{id} {date} from {sender} to {recipient}
True
{n}
{n}
p. {n}
{n}
{n}
vol. {n}
Data to be zipped:
OK app (v1.1 285d31) : ~/github/CLARIAH/descartes-tf/app
OK main data (v1.1 285d31) : ~/github/CLARIAH/descartes-tf/tf/1.1
OK graphics (v1.1 285d31) : ~/github/CLARIAH/descartes-tf/source/illustrations
OK module /parallels/tf (v1.1 285d31) : ~/github/CLARIAH/descartes-tf/parallels/tf/1.1
Writing zip file ... 0.51s ~/Downloads/github/CLARIAH/descartes-tf/complete.zip
=== FerdinandHuyck ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
text | 1 | 218025.00 | 100 |
body | 1 | 218018.00 | 100 |
div | 42 | 5190.90 | 100 |
chapter | 44 | 4963.18 | 100 |
fileDesc | 1 | 299.00 | 0 |
editionStmt | 1 | 268.00 | 0 |
p | 3725 | 58.04 | 99 |
chunk | 3833 | 56.93 | 100 |
lg | 41 | 23.34 | 0 |
ebook | 1 | 21.00 | 0 |
pod | 1 | 21.00 | 0 |
note | 9 | 20.89 | 0 |
sourceDesc | 1 | 16.00 | 0 |
bibl | 2 | 13.00 | 0 |
revisionDesc | 1 | 12.00 | 0 |
q | 27 | 9.04 | 0 |
head | 86 | 8.40 | 0 |
titleStmt | 1 | 8.00 | 0 |
l | 122 | 7.80 | 0 |
interpGrp | 1 | 7.00 | 0 |
change | 2 | 6.00 | 0 |
publicationStmt | 1 | 5.00 | 0 |
title | 3 | 5.00 | 0 |
item | 2 | 4.00 | 0 |
hi | 602 | 3.50 | 1 |
author | 3 | 3.00 | 0 |
imprint | 2 | 3.00 | 0 |
encodingDesc | 1 | 2.00 | 0 |
notesStmt | 1 | 2.00 | 0 |
order | 2 | 2.00 | 0 |
availability | 3 | 1.67 | 0 |
name | 268 | 1.21 | 0 |
idno | 9 | 1.11 | 0 |
blurb | 2 | 1.00 | 0 |
colofon | 2 | 1.00 | 0 |
date | 4 | 1.00 | 0 |
figure | 5 | 1.00 | 0 |
interp | 7 | 1.00 | 0 |
price | 2 | 1.00 | 0 |
pubPlace | 2 | 1.00 | 0 |
publisher | 2 | 1.00 | 0 |
respStmt | 2 | 1.00 | 0 |
titlepage | 2 | 1.00 | 0 |
xptr | 5 | 1.00 | 0 |
word | 218380 | 1.00 | 100 |
3
CLARIAH/wp6-ferdinandhuyck
/Users/me/github/CLARIAH/wp6-ferdinandhuyck/app
.r_.r_italic,.r_.r_italics {
font-style: italic;
color: #000000;
}
.r_.r_bold {
font-weight: bold;
color: #000000;
}
.r_.r_underline {
text-decoration: underline;
color: #000000;
}
.r_.r_center {
text-align: center;
color: #000000;
}
.r_.r_large {
font-size: large;
color: #000000;
}
.r_.r_spaced {
letter-spacing: .2rem;
color: #000000;
}
.r_.r_margin {
position: relative;
top: -0.3em;
font-weight: bold;
color: #0000ee;
}
.r_.r_above {
position: relative;
top: -0.3em;
color: #000000;
}
.r_.r_below {
position: relative;
top: 0.3em;
color: #000000;
}
.r_.r_sub {
vertical-align: sub;
font-size: small;
color: #000000;
}
.r_.r_sup, .r_.r_super {
vertical-align: super;
font-size: small;
color: #000000;
}
.r_ {
color: #dd9900;
}
.is_meta {
font-family: monospace;
color: #008800;
}
.is_note {
font-size: small;
color: #dd0055;
}
[]
none
unknown
NA
@
layout
about
{docBase}/transcription.md
transcription
layout-orig-full
True
clone
/Users/me/github/CLARIAH/wp6-ferdinandhuyck/_temp
main
{org} - {repo}
10.5281/zenodo.nnnnnn
[]
CLARIAH
/tf
wp6-ferdinandhuyck
0.1
https://public.{org}.org/{repo}
Show this on the website
en
{webBase}/<1>/<2>/<3>&version={version}
{webBase}/word?version={version}&id=<lid>
True
''
Data to be zipped:
OK app (v0.1 c11480) : ~/github/CLARIAH/wp6-ferdinandhuyck/app
OK main data (v0.1 c11480) : ~/github/CLARIAH/wp6-ferdinandhuyck/tf/0.1
Writing zip file ... 0.15s ~/Downloads/github/CLARIAH/wp6-ferdinandhuyck/complete.zip
=== Daghregisters ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
volume | 1 | 229055.00 | 100 |
page | 483 | 474.23 | 100 |
para | 4193 | 54.63 | 100 |
line | 20291 | 11.29 | 100 |
word | 229055 | 1.00 | 100 |
3
CLARIAH/wp6-daghregisters
/Users/me/github/CLARIAH/wp6-daghregisters/app
''
transcription
https://github.com/{org}/{repo}/blob/master/docs/transcription{docExt}
''
{}
True
clone
/Users/me/github/CLARIAH/wp6-daghregisters/_temp
Dagh Registers Dutch East India Company 1640-1641
CLARIAH
/tf
wp6-daghregisters
0.1
{}
Data to be zipped:
OK app (v0.2 786263) : ~/github/CLARIAH/wp6-daghregisters/app
OK main data (v0.2 786263) : ~/github/CLARIAH/wp6-daghregisters/tf/0.1
Writing zip file ... 0.22s ~/Downloads/github/CLARIAH/wp6-daghregisters/complete.zip
=== Dhammapada ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
vagga | 26 | 497.00 | 100 |
stanza | 475 | 27.20 | 100 |
sentence | 913 | 14.15 | 100 |
clause | 2328 | 5.55 | 100 |
word | 12922 | 1.00 | 100 |
3
ETCBC/dhammapada
/Users/me/github/ETCBC/dhammapada/app
.trans1 {
color: #0000cc;
}
.extrastanza1 {
font-weight: bold;
}
.clarity1 {
font-family: monospace;
color: #8888ff;
}
.uncertain1 {
font-family: monospace;
color: #888888;
}
.quote1 {
font-style: italic;
}
<code>Vagga 1:1</code>
none
unknown
NA
layoutLatin
layoutOrig
layoutPali
about
{docBase}/transcription.md
transcription
{}
True
clone
/Users/me/github/ETCBC/dhammapada/_temp
Dhammapada-Latine
10.5281/zenodo.1007624
ETCBC
/tf
dhammapada
0.2
https://www.tipitaka.net/tipitaka/dhp
Show this stanza with English translation and comments on tipitaka
{webBase}/verseload.php?verse=<2>
3
{n}
{n}
quote uncertain clarity
Data to be zipped:
OK app (v0.2 0cad6b) : ~/github/ETCBC/dhammapada/app
OK main data (v0.2 0cad6b) : ~/github/ETCBC/dhammapada/tf/0.2
Writing zip file ... 0.07s ~/Downloads/github/ETCBC/dhammapada/complete.zip
=== Peshitta ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
book | 65 | 6566.69 | 100 |
chapter | 1269 | 336.36 | 100 |
verse | 31341 | 13.62 | 100 |
word | 426835 | 1.00 | 100 |
3
ETCBC/peshitta
/Users/me/github/ETCBC/peshitta/app
''
none
unknown
NA
transcription
{docBase}/transcription-{version}{docExt}#<feature>
''
{}
True
clone
/Users/me/github/ETCBC/peshitta/_temp
Peshitta (Old Testament)
10.5281/zenodo.1463675
ETCBC
/tf
peshitta
0.2
{urlGh}/{org}/{repo}/blob/master/source
Show this document in the Peshitta repository
en
{webBase}/{version}/<1>
{}
syc
Data to be zipped:
OK app (v0.5 0b440a) : ~/github/ETCBC/peshitta/app
OK main data (v0.5 0b440a) : ~/github/ETCBC/peshitta/tf/0.2
Writing zip file ... 0.45s ~/Downloads/github/ETCBC/peshitta/complete.zip
=== SyrNT ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
book | 27 | 4060.74 | 100 |
chapter | 260 | 421.69 | 100 |
lexeme | 3038 | 36.09 | 100 |
verse | 7957 | 13.78 | 100 |
word | 109640 | 1.00 | 100 |
3
ETCBC/syrnt
/Users/me/github/ETCBC/syrnt/app
''
none
unknown
NA
transcription
{docBase}/transcription-{version}{docExt}
''
{}
True
clone
/Users/me/github/ETCBC/syrnt/_temp
SyrNT
10.5281/zenodo.1464787
ETCBC
/tf
syrnt
0.1
{urlGh}/{org}/{repo}/blob/master/plain
show this passage in the SyrNT repository
en
{webBase}/{version}/<1>.txt
{lexeme}
word
{lexeme}
vs vt
sp
syc
Data to be zipped:
OK app (0.3 8175ce) : ~/github/ETCBC/syrnt/app
OK main data (0.3 8175ce) : ~/github/ETCBC/syrnt/tf/0.1
Writing zip file ... 0.64s ~/Downloads/github/ETCBC/syrnt/complete.zip
=== NinMed ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
document | 34 | 1553.79 | 100 |
face | 54 | 978.31 | 100 |
line | 3082 | 17.14 | 100 |
cluster | 6396 | 2.62 | 32 |
word | 30816 | 1.71 | 100 |
sign | 52829 | 1.00 | 100 |
3
Nino-cunei/ninmed
/Users/me/github/Nino-cunei/ninmed/app
.pnum {
font-family: sans-serif;
font-size: small;
font-weight: bold;
color: #444444;
}
.period {
font-family: monospace;
font-size: medium;
font-weight: bold;
color: #0000bb;
}
/* LANGUAGE: superscript and subscript */
/* cluster */
.det {
vertical-align: super;
}
/* cluster */
.lang {
vertical-align: sub;
}
/* REDACTIONAL: line over or under */
/* flag */
.collated {
font-weight: bold;
text-decoration: underline;
}
/* cluster */
.excised {
color: #dd0000;
text-decoration: line-through;
}
/* cluster */
.supplied {
color: #0000ff;
text-decoration: overline;
}
/* flag */
.remarkable {
font-weight: bold;
text-decoration: overline;
}
/* UNSURE: italic*/
/* cluster */
.uncertain {
font-style: italic
}
/* flag */
.question {
font-weight: bold;
font-style: italic
}
/* BROKEN: text-shadow */
/* cluster */
.missing {
color: #999999;
text-shadow: #bbbbbb 1px 1px;
}
/* flag */
.damage {
font-weight: bold;
color: #999999;
text-shadow: #bbbbbb 1px 1px;
}
.empty {
color: #ff0000;
}
True
layoutFull
trans
layoutPlain
trans
trans
trans
mapping from readings to UNICODE
https://nbviewer.jupyter.org/github/Nino-cunei/tfFromAtf/blob/master/programs/mapReadings.ipynb
about
https://github.com/Nino-cunei/ninmed/blob/master/docs/transcription{docExt}
''
True
clone
/Users/me/github/Nino-cunei/ninmed/_temp
Nineveh Medical Encyclopedia 800 BCE: Cuneiform tablets
10.5281/zenodo.2579207
Nino-cunei
/tf
ninmed
0.3
https://cdli.ucla.edu
Show this document on CDLI
{webBase}/search/search_results.php?SearchMode=Text&ObjectID=<1>
{type}
0
docnumber
face
seal ruling note comment tr@en
collated remarkable question damage det uncertain missing excised supplied lang
True
True
0
akk
Data to be zipped:
OK app (v0.3 49eee1) : ~/github/Nino-cunei/ninmed/app
OK main data (v0.3 49eee1) : ~/github/Nino-cunei/ninmed/tf/0.3
Writing zip file ... 0.17s ~/Downloads/github/Nino-cunei/ninmed/complete.zip
=== Athenaeus ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
_book | 1 | 265146.00 | 100 |
head | 1 | 265146.00 | 100 |
book | 15 | 17676.40 | 100 |
hi | 78 | 3114.94 | 92 |
cit | 167 | 1586.02 | 100 |
num | 275 | 949.77 | 99 |
add | 789 | 335.94 | 100 |
chapter | 1328 | 199.66 | 100 |
pb | 1549 | 171.18 | 100 |
p | 1571 | 168.78 | 100 |
quote | 3126 | 84.74 | 100 |
bibl | 5264 | 51.09 | 101 |
l | 11267 | 23.50 | 100 |
_sentence | 14773 | 17.95 | 100 |
word | 265146 | 1.00 | 100 |
3
pthu/athenaeus
/Users/me/github/pthu/athenaeus/app
''
{}
{}
{}
True
clone
/Users/me/github/pthu/athenaeus/_temp
The Deipnosophistae by Athenaeus
pthu
/tf
athenaeus
1.1
{p}
beta_plain
grc
Data to be zipped:
OK app (v1.1 880dd3) : ~/github/pthu/athenaeus/app
OK main data (v1.1 880dd3) : ~/github/pthu/athenaeus/tf/1.1
Writing zip file ... 1.21s ~/Downloads/github/pthu/athenaeus/complete.zip
=== Fusus ===>
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
piece | 29 | 1413.21 | 100 |
page | 403 | 101.69 | 100 |
sentence | 2441 | 16.79 | 100 |
line | 4369 | 9.38 | 100 |
column | 4459 | 9.19 | 100 |
span | 4459 | 9.19 | 100 |
word | 40983 | 1.00 | 100 |
3
among/fusus
/Users/me/github/among/fusus/app
''
none
unknown
NA
https://{org}.github.io/{repo}/fusus
.html
align
{docBase}/about/transcription
''
{}
True
clone
/Users/me/github/among/fusus/_temp
Fusus Al Hikam, by Ibn Arabi, merged editions Lakhnawi-Afifi
10.5281/zenodo.xx1464787
among
/tf
fusus
0.7
https://{org}.github.io/{repo}/fusus/assets/lakhnawi-with-toc
show this passage in the the original (html derived from pdf)
en
{webBase}.html#p<2>
3
lettersn lettersp letterst letters_af lettersn_af lettersp_af letterst_af
ara
Data to be zipped:
OK app (v0.8 56174d) : ~/github/among/fusus/app
OK main data (v0.8 56174d) : ~/github/among/fusus/tf/0.7
Writing zip file ... 0.31s ~/Downloads/github/among/fusus/complete.zip
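
Every run above reports its archive at `~/Downloads/{backend}/{org}/{repo}/complete.zip`. The following is a minimal sketch for checking afterwards that each archive is present, assuming that path layout holds for every corpus; the layout is inferred from the output above, not taken from the Text-Fabric documentation. The helper names `expected_zip` and `report` are hypothetical and not part of Text-Fabric.

```python
from pathlib import Path


def expected_zip(backend, org, repo, downloads=None):
    """Return the path where the output above reports complete.zip.

    Assumption: archives land in <downloads>/<backend>/<org>/<repo>/complete.zip.
    """
    base = Path(downloads) if downloads is not None else Path.home() / "Downloads"
    return base / backend / org / repo / "complete.zip"


def report(corpora, downloads=None):
    """Map each corpus name to (expected zip path, whether that file exists)."""
    result = {}
    for name, (backend, org, repo) in corpora.items():
        path = expected_zip(backend, org, repo, downloads)
        result[name] = (path, path.is_file())
    return result


# Same shape as the `corpora` dict in the cell above.
corpora = dict(
    Athenaeus=("github", "pthu", "athenaeus"),
    Fusus=("github", "among", "fusus"),
)

for name, (path, found) in report(corpora).items():
    print(f"{name}: {'found' if found else 'missing'} {path}")
```

Passing a different `downloads` directory makes it possible to run the check against another location.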