Signs¶

Signs are the building blocks in the transcriptions. They correspond to the individual "glyphs" on the tablet.

However, the conversion to Text-Fabric has inserted empty signs inside comments, which we can leave out later on.

We need a few extra modules.

In [1]:

%load_ext autoreload
%autoreload 2

In [2]:

import os
import collections
from textwrap import dedent
from IPython.display import Markdown
from tf.app import use

In [3]:

A = use("Nino-cunei/uruk", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/Nino-cunei/uruk/app

data: ~/text-fabric-data/github/Nino-cunei/uruk/tf/1.0

Text-Fabric: Text-Fabric API 11.3.0, Nino-cunei/uruk/app v3, Search Reference
Data: Nino-cunei - uruk 1.0, Character table, Feature docs

Node types

Name	# of nodes	# slots/node	% coverage
tablet	6364	22.01	100
face	9456	14.10	95
column	14023	9.34	93
line	35842	3.61	92
case	9651	3.46	24
cluster	32753	1.03	24
quad	3794	2.05	6
comment	11090	1.00	8
sign	140094	1.00	100

Sets: no custom sets
Features:

Uruk IV/III: Proto-cuneiform tablets

catalogId

str

identifier of tablet in catalog (http://www.flutopedia.com/tablets.htm)

crossref

str

damage

int

indicates damage of signs or quads, corresponds to #-flag in transcription

depth

int

excavation

str

excavation number of tablet

fragment

str

level between tablet and face

fullNumber

str

the combination of face type and column number on columns

grapheme

str

name of a grapheme (glyph)

identifier

str

additional information pertaining to the name of a face

modifier

str

indicates modification of a sign; corresponds to sign@letter in transcription. If the grapheme is a repeat, the modification applies to the whole repeat.

modifierFirst

str

indicates the order between modifiers and variants on the same object; if 1, modifiers come before variants

modifierInner

str

indicates modification of a sign within a repeat; corresponds to sign@letter in transcription

name

str

name of tablet

number

str

number of a column or line or case

otype

str

period

str

period that characterises the tablet corpus

prime

int

indicates the presence/multiplicity of a prime (single quote)

remarkable

int

corresponds to ! flag in transcription

repeat

int

number indicating the number of repeats of a grapheme, especially in numerals; -1 comes from repeat N in transcription

srcLn

str

transcribed line

srcLnNum

int

line number in transcription file

terminal

str

text

str

text of comment nodes

type

str

type of a face; type of a comment; type of a cluster; type of a sign

uncertain

int

corresponds to ?-flag in transcription

variant

str

allograph for a sign, corresponds to ~x in transcription

variantOuter

str

allograph for a quad, corresponds to ~x in transcription

written

str

corresponds to !(xxx) flag in transcription

comments

none

links comment nodes to their targets

op

str

operator connecting left to right operand in a quad

oslots

none

sub

none

connects line or case with sub-cases, quad with sub-quads; clusters with sub-clusters

Text-Fabric API: names N F E L T S C TF directly usable

data: ~/text-fabric-data/github/Nino-cunei/uruk/sources/cdli/images

Found 2095 ideograph linearts

Found 2724 tablet linearts

Found 5495 tablet photos

Showing signs¶

The main characteristic of a sign is its grapheme. Everything we do with signs is complicated by the fact that signs can be repeated and augmented with primes, variants, modifiers and flags.

Before we go on, we call up our example tablet.

If you want to output multiple text items in a single output cell, you have to print() them.

In [4]:

pNum = "P005381"
query = """
tablet catalogId=P005381
"""
results = A.search(query)
A.show(results, withNodes=True)

  0.00s 1 result

result 1

P005381

tablet:148166 P005381

MSVO 3, 70uruk-iiicatalogId=P005381

comment:178162

atf: lang qpc

face:156932 obverse

column:190362 1

line:254173 1

case:167736 1a

106585 2(N14)

106586 SZE~a

106587 SAL

106588 TUR3~a

106589 NUN~a

case:167737 1b

106590 3(N19)

quad:143013

106591 GISZ

.

106592 TE

line:254174 2

106593 1(N14)

106594 NAR

106595 NUN~a

106596 SIG7

line:254175 3

106597 2(N04)#

106598 PIRIG~b1

106599 SIG7

106600 URI3~a

106601 NUN~a

column:190363 2

line:254176 1

106602 3(N04)

quad:143014

106603 GISZ

.

106604 TE

106605 GAR

quad:143015

106606 SZU2

.

quad:143016

quad:143017

106607 HI

+

106608 1(N57)

+

quad:143018

106609 HI

+

106610 1(N57)

106611 GI4~a

line:254177 2

106612 GU7

106613 AZ

106614 SI4~f

face:156933 reverse

column:190364 1

line:254178 1

106615 3(N14)

106616 SZE~a

line:254179 2

106617 3(N19)

106618 5(N04)

line:254180 3

106619 GU7

column:190365 2

line:254181 1

106620 AZ

106621 SI4~f

We navigate to the last sign in line 1 in column 2 on the obverse face:

In [5]:

case = A.nodeFromCase((pNum, "obverse:2", "1"))
sign1 = L.d(case, otype="sign")[-1]
print(sign1)

That must be the right bar code.

We can retrieve the ATF transliteration:

In [6]:

print(A.atfFromSign(sign1))

GI4~a

Note that we get the ATF for a sign by means of A.atfFromSign(node). The result includes the augments: primes, modifiers and variants. The flags are included only if we ask for them with flags=True.

Take for example the first sign on line 3 in column 1 on the obverse face:

In [7]:

case = A.nodeFromCase((pNum, "obverse:1", "3"))
sign2 = L.d(case, otype="sign")[0]
print(sign2)
print(A.atfFromSign(sign2))
print(A.atfFromSign(sign2, flags=True))

106597
2(N04)
2(N04)#

Secondly, we want to get pointers to the locations of these signs in the corpus.

In [8]:

A.pretty(sign1, withNodes=True)
A.pretty(sign2, withNodes=True)

P005381 obverse:2:1

106611 GI4~a

P005381 obverse:1:3

106597 2(N04)#

Click the link above a sign and you are taken to the CDLI page for its tablet.

If we want to enlarge the sign, we can call it up with the lineart function.

In [9]:

A.lineart([sign1, sign2], width=200)

GI4~a

2(N04)

N.B.

For concepts that span one or more transliteration lines, such as tablet, face, column, line, case, comment, you can get the source material by requesting the feature srcLn, as we have seen before.

For inline concepts, such as clusters, quads, and signs, there are functions in A.

For signs we have:

atfFromSign(sign, flags=False)

Returns the ATF representation for a sign, including primes, repeats, variants, modifiers, and, optionally, flags.
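To make the effect of the flags argument concrete, here is a minimal plain-Python sketch of how such a string could be assembled from feature values. It is not Text-Fabric's implementation: the function atf_sketch and its parameters are hypothetical, it only covers repeats, variants and the damage and uncertainty flags, and it assumes the variant sits inside the repeat, as in 1(N28~c)#?.

```python
def atf_sketch(grapheme, repeat=None, variant="", damage=False,
               uncertain=False, flags=False):
    """Assemble an ATF-like string for one sign (illustrative only)."""
    core = grapheme
    if variant:
        core += f"~{variant}"        # allograph, e.g. GI4~a
    if repeat is not None:
        core = f"{repeat}({core})"   # the repeat wraps grapheme and variant
    if flags:
        # Flags are appended only on request, mirroring flags=True
        core += "#" if damage else ""
        core += "?" if uncertain else ""
    return core


print(atf_sketch("GI4", variant="a"))
print(atf_sketch("N04", repeat=2, damage=True))
print(atf_sketch("N04", repeat=2, damage=True, flags=True))
```

The last two calls differ only in flags, just like the two atfFromSign calls on sign2.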

The unaugmented transliteration of a single sign can be obtained from the feature grapheme:

In [10]:

print(F.grapheme.v(sign1))
print(F.grapheme.v(sign2))

GI4
N04

Let's pretty-print the line in which sign2 occurs:

In [11]:

A.pretty(L.u(sign2, otype="line")[0])

P005381 obverse:1:3

line 3

2(N04)#

PIRIG~b1

SIG7

URI3~a

NUN~a

Occurrences of a sign¶

Now we are using something we learned before: we want all signs with exactly the grapheme GU7, regardless of augments or flags:

In [12]:

gu7s = F.grapheme.s("GU7")
len(gu7s)

Out[12]:

Or, with a search template:

In [13]:

results = A.search(
    """
sign grapheme=GU7
"""
)

  0.05s 314 results
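Both approaches select on the bare grapheme only, so variants and flags play no role. The idea can be sketched in plain Python with a few node numbers and graphemes taken from tablet P005381. The dictionary grapheme_of stands in for the feature and nodes_with is a hypothetical helper; nothing is claimed here about how Text-Fabric stores or indexes its features.

```python
from collections import defaultdict

# Bare graphemes of a few signs on tablet P005381 (node number -> grapheme)
grapheme_of = {
    106611: "GI4",
    106612: "GU7",
    106613: "AZ",
    106614: "SI4",
    106619: "GU7",
    106620: "AZ",
    106621: "SI4",
}

# Invert the mapping once, so that lookups by value are cheap
nodes_by_grapheme = defaultdict(list)
for node, grapheme in sorted(grapheme_of.items()):
    nodes_by_grapheme[grapheme].append(node)


def nodes_with(grapheme):
    """Return the nodes whose bare grapheme equals the given value."""
    return tuple(nodes_by_grapheme.get(grapheme, ()))


print(nodes_with("GU7"))
```

Node 106611 is written GI4~a on the tablet, but it is found under GI4, because the variant is kept in a separate feature.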

With `table()` and `show()`¶

The simplest way to show the results is with A.table() for a compact tabular view, or with A.show() for a full context view.

We show a tabular view of 3 occurrences, including node numbers. The show view can be quite unwieldy, so we show only 3 tablets.

In [14]:

A.table(results, withNodes=True, end=3)

n	p	sign
1	P001705 obverse:3:1	2456 GU7
2	P001719 obverse:1:1	2660 GU7
3	P001951 obverse:3:2	3883 GU7

In [15]:

A.show(results, end=3, showGraphics=False)

result 1

P001705

tablet P001705

ATU 6, pl. 003, W 10594+uruk-iiiW 10594 + W 10599 + W 10600

comment

atf: lang qpc

face obverse

column 1

line 1

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

column 2

line 1

2(N47)#

grapheme=N47

6(N20)#

grapheme=N20

4(N05)#

grapheme=N05

cluster ?

...

grapheme=…

cluster ?

X

grapheme=X

line 2

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

line 3

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

column 3

line 1

2(N47)#

grapheme=N47

6(N20)#

grapheme=N20

5(N05)#?

grapheme=N05

1(N42~a)#

grapheme=N42

1(N25)#

grapheme=N25

1(N28~c)#?

grapheme=N28

GU7#?

grapheme=GU7

cluster ?

...

grapheme=…

cluster ?

line 2

3(N20)#?

grapheme=N20

3(N42~a)#

grapheme=N42

1(N28~c)#?

grapheme=N28

BA

grapheme=BA

line 3

grapheme=

face reverse

column 1

line 1

case 1a

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

SZENNUR~a#?

grapheme=SZENNUR

case 1b

2(N47)

grapheme=N47

line 2

case 2a

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

U4

grapheme=U4

KI

grapheme=KI

case 2b

6(N20)

grapheme=N20

4(N05)

grapheme=N05

line 3

case 3a

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

U4

grapheme=U4

KI

grapheme=KI

case 3b

1(N01)#

grapheme=N01

1(N39~a)#

grapheme=N39

1(N24)#

grapheme=N24

1(N28)#

grapheme=N28

line 4

cluster ?

...

grapheme=…

cluster ?

result 2

P001719

tablet P001719

ATU 6, pl. 004, W 10608uruk-iiiW 10608

comment

atf: lang qpc

face obverse

column 1

line 1

cluster ?

...

grapheme=…

cluster ?

1(N14)#

grapheme=N14

2(N01)#

grapheme=N01

GU7#?

grapheme=GU7

cluster ?

...

grapheme=…

cluster ?

line 2

cluster ?

...

grapheme=…

cluster ?

GUM~b#

grapheme=GUM

cluster ?

...

grapheme=…

cluster ?

column 2

line 1

cluster ?

...

grapheme=…

cluster ?

...

grapheme=…

cluster ?

result 3

P001951

tablet P001951

ATU 6, pl. 032, W 14109uruk-iiiW 14109

comment

atf: lang qpc

face obverse

column 1

line 1

1(N01)

grapheme=N01

2(N58)

grapheme=N58

ZATU675~d

grapheme=ZATU675

NAGA~a

grapheme=NAGA

column 2

line 1

1(N01)

grapheme=N01

ZATU659

grapheme=ZATU659

BU~a?

grapheme=BU

NAM2

grapheme=NAM2

PAP~a

grapheme=PAP

line 2

SZITA~a3

grapheme=SZITA

quad

U4

grapheme=U4

x

1(N01)

grapheme=N01

E2~b

grapheme=E2

line 3

1(N57)

grapheme=N57

DU

grapheme=DU

E2~b

grapheme=E2

NUNUZ~a1#

grapheme=NUNUZ

line 4

case 4a

1(N57)

grapheme=N57

NAGA~a#

grapheme=NAGA

3(N57)#

grapheme=N57

case 4b

1(N57)

grapheme=N57

E2~b#

grapheme=E2

X

grapheme=X

line 5

case 5a

1(N01)

grapheme=N01

ZATU659

grapheme=ZATU659

NIR~a#

grapheme=NIR

case 5b

SU~a

grapheme=SU

PAP~a

grapheme=PAP

3(N57)

grapheme=N57

line 6

1(N57)

grapheme=N57

EN~a

grapheme=EN

SAG

grapheme=SAG

SZITA~a1

grapheme=SZITA

line 7

1(N57)

grapheme=N57

PAP~a

grapheme=PAP

GA2~a1

grapheme=GA2

GISZ

grapheme=GISZ

ERIM~a#?

grapheme=ERIM

column 3

line 1

1(N57)

grapheme=N57

BU~a#?

grapheme=BU

SZU

grapheme=SZU

line 2

GU7

grapheme=GU7

As a plain list¶

There are a few hundred occurrences. We show a bit more context for them, like we did before: the full grapheme with all its augments and flags, and the full source line.

In [16]:

for g in gu7s[0:10]:
    t = L.u(g, otype="tablet")[0]
    cl = A.lineFromNode(g)
    pNum = T.sectionFromNode(t)[0]
    gRep = A.atfFromSign(g, flags=True)

    line = F.srcLn.v(cl)

    print(f"{gRep:<7} {pNum} {line}")

GU7#?   P001705 1. 2(N47)# 6(N20)# 5(N05)#? 1(N42~a)# 1(N25)# 1(N28~c)#? , GU7#? [...] 
GU7#?   P001719 1. [...] 1(N14)# 2(N01)# , GU7#? [...] 
GU7     P001951 2. , GU7 
GU7#    P002002 1. , [...] EN~a# |SILA3~axSZE~a@t|# X [...] GU7# 
GU7#    P002035 2. , GU7# 
GU7#    P002062 1. [...] , [...] GU7# 
GU7     P002100 1. , [...] GU7 
GU7#    P002370 1. 1(N14) 1(N01) , GU7# [...] 
GU7     P002510 1. 1(N34) 1(N14) 1(N01) , GIR2~a GU7 
GU7     P002524 2. , GU7

As a linked table¶

We can make it more user friendly: we can link each occurrence to its page on CDLI, and put everything in a Markdown table.

We have a function to generate the link: A.cdli().

We build a markdown table.

We write a function for this, because we want to do it again.

First we use the function to write the first 10 to the screen, and then to write the whole set to a directory on your file system.

In [17]:

def showSigns(signs, amount=None):

    markdown = dedent("""\
    sign | tablet | line
    ---- | ------ | ----\
    """).strip()

    for g in signs if amount is None else signs[0:amount]:
        t = L.u(g, otype="tablet")[0]
        cl = A.lineFromNode(g)
        gRep = A.atfFromSign(g, flags=True)
        line = F.srcLn.v(cl).replace("|", "&#124;")

        markdown += f"\n{gRep} | {A.cdli(t, asString=True)} | {line}"

    markdown += "\n"
    return markdown

In [18]:

Markdown(showSigns(gu7s, 3))

Out[18]:

sign	tablet	line
GU7#?	P001705	1. 2(N47)# 6(N20)# 5(N05)#? 1(N42~a)# 1(N25)# 1(N28~c)#? , GU7#? [...]
GU7#?	P001719	1. [...] 1(N14)# 2(N01)# , GU7#? [...]
GU7	P001951	2. , GU7

A bit more please ...

In [19]:

Markdown(showSigns(gu7s, 10))

Out[19]:

sign	tablet	line
GU7#?	P001705	1. 2(N47)# 6(N20)# 5(N05)#? 1(N42~a)# 1(N25)# 1(N28~c)#? , GU7#? [...]
GU7#?	P001719	1. [...] 1(N14)# 2(N01)# , GU7#? [...]
GU7	P001951	2. , GU7
GU7#	P002002	1. , [...] EN~a# \|SILA3~axSZE~a@t\|# X [...] GU7#
GU7#	P002035	2. , GU7#
GU7#	P002062	1. [...] , [...] GU7#
GU7	P002100	1. , [...] GU7
GU7#	P002370	1. 1(N14) 1(N01) , GU7# [...]
GU7	P002510	1. 1(N34) 1(N14) 1(N01) , GIR2~a GU7
GU7	P002524	2. , GU7
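The replacement of | by &#124; in showSigns matters because ATF writes quads between pipes, and a literal pipe inside a cell would be read as a column separator. Here is a self-contained sketch of the same technique without Text-Fabric; markdown_table is a hypothetical helper and the rows are copied from the output above.

```python
def markdown_table(header, rows):
    """Build a Markdown table, escaping pipes inside cell text."""

    def escape(cell):
        # A literal | would otherwise be read as a column separator
        return str(cell).replace("|", "&#124;")

    lines = [
        " | ".join(header),
        " | ".join("-" * len(name) for name in header),
    ]
    for row in rows:
        lines.append(" | ".join(escape(cell) for cell in row))
    return "\n".join(lines) + "\n"


rows = [
    ("GU7#", "P002002", "1. , [...] EN~a# |SILA3~axSZE~a@t|# X [...] GU7#"),
    ("GU7", "P002524", "2. , GU7"),
]
print(markdown_table(("sign", "tablet", "line"), rows))
```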

Everything to file¶

We give you the whole list, in a Markdown file, on your local system.

In [20]:

os.makedirs(A.tempDir, exist_ok=True)
with open(f"{A.tempDir}/gu7.md", "w") as fh:
    fh.write(showSigns(gu7s))

print(f"data written to file {A.tempDir}/gu7.md")

data written to file /Users/me/text-fabric-data/github/Nino-cunei/uruk/_temp/gu7.md

Have a look!

Tip: Open the file in Atom and switch to preview with Ctrl+Shift+M.

Again, the tablet links are clickable, and bring you straight to CDLI.

Frequency lists¶

We use a bit more power of Text-Fabric by generating frequency lists.

Graphemes¶

We just studied the GU7 grapheme a bit. Suppose we want to get an overview of all graphemes?

There is a generic Text-Fabric function to give us that. For each feature you can call up a frequency list of its values.

In [21]:

graphemes = F.grapheme.freqList()
len(graphemes)

Out[21]:

We show the top-20:

In [22]:

graphemes[0:20]

Out[22]:

(('…', 29413),
 ('N01', 21645),
 ('', 12440),
 ('X', 6870),
 ('N14', 5898),
 ('EN', 1950),
 ('N34', 1831),
 ('N57', 1826),
 ('SZE', 1334),
 ('GAL', 1180),
 ('DUG', 1084),
 ('U4', 1023),
 ('AN', 1020),
 ('PAP', 876),
 ('SAL', 876),
 ('NUN', 870),
 ('E2', 854),
 ('GI', 850),
 ('BA', 781),
 ('SANGA', 733))

N.B.:

empty graphemes: ('', 12440). The conversion to Text-Fabric has inserted these inside comments, in order to link the comments to the tablets.
ellipsis graphemes: ('…', 29413). These correspond to the ... in ATF, usually within an uncertainty cluster [...]
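If the placeholder values get in the way, they can be filtered out before counting. The sketch below does this with collections.Counter on a small made-up sample. The function freq_list and its skip parameter are hypothetical, and the explicit tie-breaking on value is a choice made here, not a statement about the order that F.grapheme.freqList() uses for equal frequencies.

```python
from collections import Counter

# A tiny, made-up sample of bare graphemes; "" and "…" are the placeholders
sample = ["N01", "…", "GU7", "", "N01", "X", "…", "N01", "GU7", "…"]


def freq_list(values, skip=("", "…")):
    """Count values, drop the skipped ones, sort by descending frequency."""
    counts = Counter(v for v in values if v not in skip)
    return tuple(sorted(counts.items(), key=lambda x: (-x[1], x[0])))


print(freq_list(sample))
```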

Augments¶

We can quickly get an overview of all kinds of augments: primes, variants, modifiers, flags.

Prime¶

The prime is a feature with values: 2, 1 or 0. The number indicates the number of primes. Below you see how often that occurs. Note that we count all primes here: on signs, case numbers and column numbers.

In [23]:

for (value, frequency) in F.prime.freqList():
    print(f"{frequency:>5} x {value}")

 5164 x 1
    1 x 2

Variant¶

The variant or allograph is what occurs after the grapheme and after the ~ symbol, which should be digits and/or lowercase letters except the x.

Here is the frequency list of variant values.

In [24]:

for (value, frequency) in F.variant.freqList():
    print(f"{frequency:>5} x {value}")

Modifier¶

The modifier is what occurs after the grapheme and after the @ symbol. It consists of digits and/or lowercase letters except the x.

Sometimes a modifier occurs inside a repeat. In that case we have stored the modifier in the feature modifierInner, as in

7(N34@f)

Here is the frequency list of modifier and modifierInner values.

In [25]:

for (value, frequency) in F.modifier.freqList():
    print(f"{frequency:>5} x {value}")

  634 x g
  262 x t
   35 x n
    6 x r
    4 x s
    1 x c
    1 x h

In [26]:

for (value, frequency) in F.modifierInner.freqList():
    print(f"{frequency:>5} x {value}")

   25 x f
    1 x r
    1 x v
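The descriptions of variants and modifiers can be turned into a small parser for transcribed signs. The following is a sketch under stated assumptions, not the parser used by the conversion: the pattern SIGN_RE and the function parse_sign are hypothetical, primes and quads are not handled, a modifier written outside the parentheses of a repeat is not recognized, and a modifier inside a repeat is reported under the key modifier rather than modifierInner.

```python
import re

# Hypothetical pattern: optional repeat, grapheme, augments, flags
SIGN_RE = re.compile(
    r"""
    ^(?:(?P<repeat>\d+)\()?                 # repeat count plus opening bracket
    (?P<grapheme>[A-Z][A-Z0-9]*)            # bare grapheme, e.g. N34 or GU7
    (?P<augments>(?:[~@][a-wyz0-9]+)*)      # ~variant and @modifier parts
    (?(repeat)\))                           # closing bracket only after a repeat
    (?P<flags>[#?!]*)$                      # damage, uncertain, remarkable
    """,
    re.VERBOSE,
)


def parse_sign(atf):
    """Split a transcribed sign into its parts, or return None."""
    match = SIGN_RE.match(atf)
    if match is None:
        return None
    parts = match.groupdict()
    augments = parts.pop("augments")
    # Separate the augments into variant (~) and modifier (@) values
    found = dict(re.findall(r"([~@])([a-wyz0-9]+)", augments))
    parts["variant"] = found.get("~", "")
    parts["modifier"] = found.get("@", "")
    return parts


print(parse_sign("7(N34@f)"))
print(parse_sign("1(N28~c)#?"))
```

The character class [a-wyz0-9] encodes the rule that variants and modifiers consist of digits and lowercase letters other than x, which ATF reserves as an operator inside quads.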

Full signs¶

We make a frequency list of all full signs, i.e. the grapheme including variant, modifier, and prime. We show them as they appear in transcriptions.

We only deal with instances which are not contained in a quad.

This is no longer the frequency distribution of the values of a single feature, so we have to do the coding ourselves.

In [27]:

fullGraphemes = collections.Counter()

for n in F.otype.s("sign"):
    grapheme = F.grapheme.v(n)
    if grapheme == "" or grapheme == "…" or grapheme == "X":
        continue
    fullGrapheme = A.atfFromSign(n)
    fullGraphemes[fullGrapheme] += 1

len(fullGraphemes)

Out[27]:

Or with a query:

In [28]:

query = """
sign type=ideograph|numeral
"""
fullGraphemesQ = {A.atfFromSign(r[0]) for r in A.search(query, silent=True)}
len(fullGraphemesQ)

Out[28]:

There! We have counted all incarnations of full graphemes, and there are 1476 distinct ones.

We show the top-20, sorted by frequency.

We specify a key function that, given a (value, amount) pair, returns (-amount, value). This determines the order after sorting: signs with a high amount come before signs with a low amount, and signs with equal amounts are ordered by value.

In [29]:

for (value, frequency) in sorted(
    fullGraphemes.items(),
    key=lambda x: (-x[1], x[0]),
)[0:20]:
    print(f"{frequency:>5} x {value}")

12983 x 1(N01)
 3080 x 2(N01)
 2584 x 1(N14)
 1830 x EN~a
 1598 x 3(N01)
 1357 x 2(N14)
 1294 x 5(N01)
 1294 x SZE~a
 1164 x GAL~a
 1117 x 4(N01)
 1022 x U4
 1020 x AN
  999 x 1(N34)
  876 x SAL
  851 x PAP~a
  849 x GI
  791 x 3(N14)
  789 x 1(N57)
  781 x BA
  719 x NUN~a
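The effect of the key function shows best on a tie. In the list above, 5(N01) and SZE~a both occur 1294 times. The sketch below sorts a handful of the counts from that list in two ways; the names by_freq and by_name are only illustrative.

```python
from collections import Counter

# Counts taken from the top-20 list above
counts = Counter({"SZE~a": 1294, "1(N01)": 12983, "5(N01)": 1294, "EN~a": 1830})

# Highest frequency first; ties are broken by the sign itself
by_freq = sorted(counts.items(), key=lambda x: (-x[1], x[0]))

# Alphabetical by sign; the frequency only matters for identical signs
by_name = sorted(counts.items(), key=lambda x: (x[0], -x[1]))

for value, frequency in by_freq:
    print(f"{frequency:>5} x {value}")
```

Negating the amount gives a descending order on frequency while keeping an ascending order on value. Counter.most_common() also sorts by descending frequency, but it keeps equal counts in the order in which they were first encountered.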

Writing results to file¶

We also want to write the results to files in your _temp directory, within this repo.

writeFreqs writes distribution data of data items called dataName to a file fileName.txt. In fact, it writes two files:

fileName-alpha.txt, ordered by data items
fileName-freq.txt, ordered by frequency.

In [30]:

def writeFreqs(fileName, data, dataName):
    print(f"There are {len(data)} {dataName}s")

    for (sortName, sortKey) in (
        ("alpha", lambda x: (x[0], -x[1])),
        ("freq", lambda x: (-x[1], x[0])),
    ):
        with open(f"{A.tempDir}/{fileName}-{sortName}.txt", "w") as fh:
            for (item, freq) in sorted(data, key=sortKey):
                if item != "":
                    fh.write(f"{freq:>5} x {item}\n")

In [31]:

writeFreqs("grapheme-plain", F.grapheme.freqList(), "bare grapheme")

There are 632 bare graphemes

In [32]:

writeFreqs("grapheme-full", fullGraphemes.items(), "full grapheme")

There are 1476 full graphemes

Now have a look at your A.tempDir directory, where you see four generated files:

grapheme-plain-alpha.txt (sorted by grapheme)
grapheme-plain-freq.txt (sorted by frequency)
grapheme-full-alpha.txt (sorted by grapheme)
grapheme-full-freq.txt (sorted by frequency)
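To try the two sort orders without a loaded corpus, here is a self-contained variant of the same idea that writes to a temporary directory. The function write_freqs, its directory parameter and the file name grapheme-demo are hypothetical; the three counts are taken from earlier in this chapter.

```python
import tempfile
from pathlib import Path


def write_freqs(directory, file_name, data):
    """Write data twice: sorted by item and sorted by frequency."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    sort_keys = {
        "alpha": lambda x: (x[0], -x[1]),
        "freq": lambda x: (-x[1], x[0]),
    }
    paths = []
    for sort_name, sort_key in sort_keys.items():
        path = directory / f"{file_name}-{sort_name}.txt"
        with open(path, "w", encoding="utf8") as fh:
            for item, freq in sorted(data, key=sort_key):
                if item != "":  # skip the empty grapheme
                    fh.write(f"{freq:>5} x {item}\n")
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    data = [("GU7", 314), ("", 12440), ("N01", 21645)]
    for path in write_freqs(tmp, "grapheme-demo", data):
        print(path.name, path.read_text(encoding="utf8").splitlines())
```

Returning the paths makes it easy to inspect or clean up the files afterwards.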

Next¶

quads

Things never stay simple ...

All chapters: start imagery steps search calc signs quads jumps cases

CC-BY Dirk Roorda
