
Data Extraction for Esther Discourse Analysis

This notebook shows how to extract a data frame from the ETCBC BHSA data for analyzing the linguistic features of the book of Esther.

In [8]:

from tf.app import use
import csv

A = use("etcbc/bhsa", hoist=globals())

Locating corpus resources ...

app: ~/text-fabric-data/github/etcbc/bhsa/app

data: ~/text-fabric-data/github/etcbc/bhsa/tf/2021

data: ~/text-fabric-data/github/etcbc/phono/tf/2021

data: ~/text-fabric-data/github/etcbc/parallels/tf/2021

Text-Fabric: Text-Fabric API 11.3.0, etcbc/bhsa/app v3, Search Reference
Data: etcbc - bhsa 2021, Character table, Feature docs

Node types

Name	# of nodes	# slots/node	% coverage
book	39	10938.21	100
chapter	929	459.19	100
lex	9230	46.22	100
verse	23213	18.38	100
half_verse	45179	9.44	100
sentence	63717	6.70	100
sentence_atom	64514	6.61	100
clause	88131	4.84	100
clause_atom	90704	4.70	100
phrase	253203	1.68	100
phrase_atom	267532	1.59	100
subphrase	113850	1.42	38
word	426590	1.00	100

Sets: no custom sets
Features:

Parallel Passages

crossref

int

🆗 links between similar passages

BHSA = Biblia Hebraica Stuttgartensia Amstelodamensis

book

str

✅ book name in Latin (Genesis; Numeri; Reges1; ...)

book@ll

str

✅ book name in amharic (ኣማርኛ)

chapter

int

✅ chapter number (1; 2; 3; ...)

code

int

✅ identifier of a clause atom relationship (0; 74; 367; ...)

det

str

✅ determinedness of phrase(atom) (det; und; NA.)

domain

str

✅ text type of clause (? (Unknown); N (narrative); D (discursive); Q (Quotation).)

freq_lex

int

✅ frequency of lexemes

function

str

✅ syntactic function of phrase (Cmpl; Objc; Pred; ...)

g_cons

str

✅ word consonantal-transliterated (B R>CJT BR> >LHJM ...)

g_cons_utf8

str

✅ word consonantal-Hebrew (ב ראשׁית ברא אלהים)

g_lex

str

✅ lexeme pointed-transliterated (B.:- R;>CIJT B.@R@> >:ELOH ...)

g_lex_utf8

str

✅ lexeme pointed-Hebrew (בְּ רֵאשִׁית בָּרָא אֱלֹה)

g_word

str

✅ word pointed-transliterated (B.:- R;>CI73JT B.@R@74> >:ELOHI92JM)

g_word_utf8

str

✅ word pointed-Hebrew (בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים)

gloss

str

🆗 english translation of lexeme (beginning create god(s))

gn

str

✅ grammatical gender (m; f; NA; unknown.)

label

str

✅ (half-)verse label (half verses: A; B; C; verses: GEN 01,02)

language

str

✅ of word or lexeme (Hebrew; Aramaic.)

lex

str

✅ lexeme consonantal-transliterated (B R>CJT/ BR>[ >LHJM/)

lex_utf8

str

✅ lexeme consonantal-Hebrew (ב ראשׁית֜ ברא אלהים֜)

ls

str

✅ lexical set, subclassification of part-of-speech (card; ques; mult)

nametype

str

⚠️ named entity type (pers; mens; gens; topo; ppde.)

nme

str

✅ nominal ending consonantal-transliterated (absent; n/a; JM, ...)

nu

str

✅ grammatical number (sg; du; pl; NA; unknown.)

number

int

✅ sequence number of an object within its context

otype

str

pargr

str

🆗 hierarchical paragraph number (1; 1.2; 1.2.3.4; ...)

pdp

str

✅ phrase dependent part-of-speech (art; verb; subs; nmpr, ...)

pfm

str

✅ preformative consonantal-transliterated (absent; n/a; J, ...)

prs

str

✅ pronominal suffix consonantal-transliterated (absent; n/a; W; ...)

prs_gn

str

✅ pronominal suffix gender (m; f; NA; unknown.)

prs_nu

str

✅ pronominal suffix number (sg; du; pl; NA; unknown.)

prs_ps

str

✅ pronominal suffix person (p1; p2; p3; NA; unknown.)

ps

str

✅ grammatical person (p1; p2; p3; NA; unknown.)

qere

str

✅ word pointed-transliterated masoretic reading correction

qere_trailer

str

✅ interword material -pointed-transliterated (Masoretic correction)

qere_trailer_utf8

str

✅ interword material -pointed-transliterated (Masoretic correction)

qere_utf8

str

✅ word pointed-Hebrew masoretic reading correction

rank_lex

int

✅ ranking of lexemes based on frequency

rela

str

✅ linguistic relation between clause/(sub)phrase(atom) (ADJ; MOD; ATR; ...)

sp

str

✅ part-of-speech (art; verb; subs; nmpr, ...)

st

str

✅ state of a noun (a (absolute); c (construct); e (emphatic).)

tab

int

✅ clause atom: its level in the linguistic embedding

trailer

str

✅ interword material pointed-transliterated (& 00 05 00_P ...)

trailer_utf8

str

✅ interword material pointed-Hebrew (־ ׃)

txt

str

✅ text type of clause and surrounding (repetition of ? N D Q as in feature domain)

typ

str

✅ clause/phrase(atom) type (VP; NP; Ellp; Ptcp; WayX)

uvf

str

✅ univalent final consonant consonantal-transliterated (absent; N; J; ...)

vbe

str

✅ verbal ending consonantal-transliterated (n/a; W; ...)

vbs

str

✅ root formation consonantal-transliterated (absent; n/a; H; ...)

verse

int

✅ verse number

voc_lex

str

✅ vocalized lexeme pointed-transliterated (B.: R;>CIJT BR> >:ELOHIJM)

voc_lex_utf8

str

✅ vocalized lexeme pointed-Hebrew (בְּ רֵאשִׁית ברא אֱלֹהִים)

vs

str

✅ verbal stem (qal; piel; hif; apel; pael)

vt

str

✅ verbal tense (perf; impv; wayq; infc)

mother

none

✅ linguistic dependency between textual objects

oslots

none

Phonetic Transcriptions

phono

str

🆗 phonological transcription (bᵊ rēšˌîṯ bārˈā ʔᵉlōhˈîm)

phono_trailer

str

🆗 interword material in phonological transcription

Text-Fabric API: names N F E L T S C TF directly usable

Loading the CSV of Esther discourse sentences

Set the data variable

esther_discourse_data, esther_letter_data, *esther_description_data

In [13]:

data = "description"

In [14]:

sentencelist = []

# newline='' is what the csv module expects when it reads a file
with open('esther_' + data + '.csv', 'r', newline='', encoding='utf-8') as file:
    reader = csv.reader(file)
    next(reader)  # skip the header row
    for row in reader:
        # Keep columns 1-3: sentence node (as int), speaker, text type
        sentencelist.append([int(row[1]), row[2], row[3]])

print(sentencelist)

[[1229022, 'author', 'description'], [1229023, 'author', 'description'], [1229024, 'author', 'description'], [1229025, 'author', 'description'], [1229026, 'author', 'description'], [1229027, 'author', 'description'], [1229028, 'author', 'description'], [1229029, 'author', 'description'], [1229030, 'author', 'description'], [1229031, 'author', 'description'], [1229032, 'author', 'description'], [1229033, 'author', 'description'], [1229034, 'author', 'description'], [1229035, 'author', 'description'], [1229036, 'author', 'description'], [1229037, 'author', 'description'], [1229038, 'author', 'description'], [1229039, 'author', 'description'], [1229040, 'author', 'description'], [1229041, 'author', 'description'], [1229043, 'author', 'description'], [1229057, 'author', 'description'], [1229058, 'author', 'description'], [1229060, 'author', 'description'], [1229061, 'author', 'description'], [1229067, 'author', 'description'], [1229068, 'author', 'description'], [1229069, 'author', 'description'], [1229070, 'author', 'description'], [1229071, 'author', 'description'], [1229072, 'author', 'description'], [1229073, 'author', 'description'], [1229074, 'author', 'description'], [1229075, 'author', 'description'], [1229076, 'author', 'description'], [1229077, 'author', 'description'], [1229078, 'author', 'description'], [1229079, 'author', 'description'], [1229080, 'author', 'description'], [1229081, 'author', 'description'], [1229082, 'author', 'description'], [1229083, 'author', 'description'], [1229084, 'author', 'description'], [1229085, 'author', 'description'], [1229086, 'author', 'description'], [1229087, 'author', 'description'], [1229088, 'author', 'description'], [1229089, 'author', 'description'], [1229090, 'author', 'description'], [1229091, 'author', 'description'], [1229092, 'author', 'description'], [1229093, 'author', 'description'], [1229094, 'author', 'description'], [1229095, 'author', 'description'], [1229096, 'author', 'description'], [1229097, 
'author', 'description'], [1229098, 'author', 'description'], [1229099, 'author', 'description'], [1229100, 'author', 'description'], [1229101, 'author', 'description'], [1229102, 'author', 'description'], [1229103, 'author', 'description'], [1229104, 'author', 'description'], [1229105, 'author', 'description'], [1229106, 'author', 'description'], [1229107, 'author', 'description'], [1229108, 'author', 'description'], [1229109, 'author', 'description'], [1229110, 'author', 'description'], [1229111, 'author', 'description'], [1229112, 'author', 'description'], [1229113, 'author', 'description'], [1229114, 'author', 'description'], [1229115, 'author', 'description'], [1229116, 'author', 'description'], [1229117, 'author', 'description'], [1229118, 'author', 'description'], [1229119, 'author', 'description'], [1229120, 'author', 'description'], [1229121, 'author', 'description'], [1229122, 'author', 'description'], [1229123, 'author', 'description'], [1229125, 'author', 'description'], [1229126, 'author', 'description'], [1229127, 'author', 'description'], [1229128, 'author', 'description'], [1229129, 'author', 'description'], [1229130, 'author', 'description'], [1229131, 'author', 'description'], [1229132, 'author', 'description'], [1229133, 'author', 'description'], [1229134, 'author', 'description'], [1229135, 'author', 'description'], [1229136, 'author', 'description'], [1229137, 'author', 'description'], [1229145, 'author', 'description'], [1229146, 'author', 'description'], [1229147, 'author', 'description'], [1229149, 'author', 'description'], [1229150, 'author', 'description'], [1229151, 'author', 'description'], [1229152, 'author', 'description'], [1229155, 'author', 'description'], [1229156, 'author', 'description'], [1229157, 'author', 'description'], [1229158, 'author', 'description'], [1229159, 'author', 'description'], [1229160, 'author', 'description'], [1229161, 'author', 'description'], [1229162, 'author', 'description'], [1229163, 'author', 
'description'], [1229164, 'author', 'description'], [1229165, 'author', 'description'], [1229166, 'author', 'description'], [1229167, 'author', 'description'], [1229168, 'author', 'description'], [1229169, 'author', 'description'], [1229170, 'author', 'description'], [1229171, 'author', 'description'], [1229172, 'author', 'description'], [1229173, 'author', 'description'], [1229174, 'author', 'description'], [1229175, 'author', 'description'], [1229176, 'author', 'description'], [1229177, 'author', 'description'], [1229178, 'author', 'description'], [1229179, 'author', 'description'], [1229180, 'author', 'description'], [1229181, 'author', 'description'], [1229185, 'author', 'description'], [1229186, 'author', 'description'], [1229191, 'author', 'description'], [1229200, 'author', 'description'], [1229201, 'author', 'description'], [1229202, 'author', 'description'], [1229203, 'author', 'description'], [1229204, 'author', 'description'], [1229205, 'author', 'description'], [1229206, 'author', 'description'], [1229207, 'author', 'description'], [1229208, 'author', 'description'], [1229209, 'author', 'description'], [1229210, 'author', 'description'], [1229211, 'author', 'description'], [1229215, 'author', 'description'], [1229218, 'author', 'description'], [1229219, 'author', 'description'], [1229220, 'author', 'description'], [1229221, 'author', 'description'], [1229226, 'author', 'description'], [1229227, 'author', 'description'], [1229231, 'author', 'description'], [1229232, 'author', 'description'], [1229233, 'author', 'description'], [1229234, 'author', 'description'], [1229235, 'author', 'description'], [1229236, 'author', 'description'], [1229237, 'author', 'description'], [1229238, 'author', 'description'], [1229239, 'author', 'description'], [1229240, 'author', 'description'], [1229241, 'author', 'description'], [1229244, 'author', 'description'], [1229245, 'author', 'description'], [1229250, 'author', 'description'], [1229251, 'author', 'description'], 
[1229252, 'author', 'description'], [1229253, 'author', 'description'], [1229254, 'author', 'description'], [1229256, 'author', 'description'], [1229258, 'author', 'description'], [1229260, 'author', 'description'], [1229262, 'author', 'description'], [1229263, 'author', 'description'], [1229265, 'author', 'description'], [1229267, 'author', 'description'], [1229268, 'author', 'description'], [1229270, 'author', 'description'], [1229272, 'author', 'description'], [1229280, 'author', 'description'], [1229285, 'author', 'description'], [1229286, 'author', 'description'], [1229287, 'author', 'description'], [1229288, 'author', 'description'], [1229290, 'author', 'description'], [1229291, 'author', 'description'], [1229292, 'author', 'description'], [1229293, 'author', 'description'], [1229297, 'author', 'description'], [1229298, 'author', 'description'], [1229299, 'author', 'description'], [1229300, 'author', 'description'], [1229301, 'author', 'description'], [1229306, 'author', 'description'], [1229307, 'author', 'description'], [1229314, 'author', 'description'], [1229315, 'author', 'description'], [1229318, 'author', 'description'], [1229321, 'author', 'description'], [1229322, 'author', 'description'], [1229323, 'author', 'description'], [1229324, 'author', 'description'], [1229325, 'author', 'description'], [1229326, 'author', 'description'], [1229327, 'author', 'description'], [1229329, 'author', 'description'], [1229330, 'author', 'description'], [1229331, 'author', 'description'], [1229334, 'author', 'description'], [1229336, 'author', 'description'], [1229337, 'author', 'description'], [1229338, 'author', 'description'], [1229339, 'author', 'description'], [1229340, 'author', 'description'], [1229341, 'author', 'description'], [1229342, 'author', 'description'], [1229343, 'author', 'description'], [1229344, 'author', 'description'], [1229345, 'author', 'description'], [1229346, 'author', 'description'], [1229347, 'author', 'description'], [1229348, 'author', 
'description'], [1229349, 'author', 'description'], [1229350, 'author', 'description'], [1229351, 'author', 'description'], [1229352, 'author', 'description'], [1229362, 'author', 'description'], [1229368, 'author', 'description'], [1229369, 'author', 'description'], [1229370, 'author', 'description'], [1229372, 'author', 'description'], [1229373, 'author', 'description'], [1229374, 'author', 'description'], [1229375, 'author', 'description'], [1229376, 'author', 'description'], [1229377, 'author', 'description'], [1229378, 'author', 'description'], [1229379, 'author', 'description'], [1229380, 'author', 'description'], [1229381, 'author', 'description'], [1229382, 'author', 'description'], [1229383, 'author', 'description'], [1229384, 'author', 'description'], [1229385, 'author', 'description'], [1229386, 'author', 'description'], [1229387, 'author', 'description'], [1229388, 'author', 'description'], [1229389, 'author', 'description'], [1229390, 'author', 'description'], [1229391, 'author', 'description'], [1229392, 'author', 'description'], [1229393, 'author', 'description'], [1229394, 'author', 'description'], [1229395, 'author', 'description'], [1229396, 'author', 'description'], [1229397, 'author', 'description'], [1229398, 'author', 'description'], [1229399, 'author', 'description'], [1229400, 'author', 'description'], [1229401, 'author', 'description'], [1229409, 'author', 'description'], [1229413, 'author', 'description'], [1229414, 'author', 'description'], [1229415, 'author', 'description'], [1229416, 'author', 'description'], [1229417, 'author', 'description'], [1229418, 'author', 'description'], [1229419, 'author', 'description'], [1229420, 'author', 'description'], [1229421, 'author', 'description'], [1229422, 'author', 'description'], [1229423, 'author', 'description'], [1229424, 'author', 'description'], [1229425, 'author', 'description'], [1229426, 'author', 'description'], [1229427, 'author', 'description'], [1229428, 'author', 'description'], 
[1229429, 'author', 'description'], [1229430, 'author', 'description'], [1229432, 'author', 'description'], [1229440, 'author', 'description'], [1229441, 'author', 'description'], [1229442, 'author', 'description'], [1229443, 'author', 'description'], [1229444, 'author', 'description'], [1229445, 'author', 'description'], [1229446, 'author', 'description'], [1229447, 'author', 'description'], [1229448, 'author', 'description'], [1229449, 'author', 'description'], [1229450, 'author', 'description'], [1229451, 'author', 'description'], [1229452, 'author', 'description'], [1229453, 'author', 'description'], [1229454, 'author', 'description'], [1229455, 'author', 'description']]
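The positional indexing above (`row[1]`, `row[2]`, `row[3]`) depends on the column order of the CSV. A minimal sketch of the same load keyed by header name, assuming the file carries the header `verse, sentence_node, speakers, type` that the description-file cell of this notebook writes. The helper name `load_sentences` and the two sample rows are hypothetical.

```python
import csv
import io


def load_sentences(handle):
    """Return [node, speaker, type] lists from a CSV that has a header row."""
    reader = csv.DictReader(handle)
    return [
        [int(row["sentence_node"]), row["speakers"], row["type"]]
        for row in reader
    ]


# Hypothetical sample in the same layout as esther_description.csv
sample = io.StringIO(
    "verse,sentence_node,speakers,type\n"
    "Esther 1:1,1229022,author,description\n"
    "Esther 1:1,1229023,author,description\n"
)
sample_rows = load_sentences(sample)
print(sample_rows)
```

Reading by column name keeps the loader working if columns are later reordered or added, at the cost of failing with a `KeyError` when a header is renamed.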

In [15]:

sentencedata = []
authordata = []
ttypdata = []

for sentence in sentencelist:
    wordNode = L.d(sentence[0], otype='word')
    wordNodeCount = len(wordNode)
    sentencedata.extend(list(wordNode))
    
    # Repeat the author and typ values once per word node so every word gets matching labels
    author = [sentence[1]]*wordNodeCount
    authordata.extend(author)
    
    ttyp = [sentence[2]]*wordNodeCount
    ttypdata.extend(ttyp)

wordNode = tuple(sentencedata)
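The cell above flattens sentences into words and repeats each sentence's labels once per word, so the three lists stay aligned by index. A small sketch of that step in isolation, using a plain dictionary as a stand-in for `L.d(node, otype='word')`. The function name `expand_labels`, the node numbers, and the speaker label `Esther` are made up for illustration and are not BHSA data.

```python
def expand_labels(sentences, words_of):
    """Flatten sentences to words, repeating each sentence's labels per word."""
    words, authors, types = [], [], []
    for node, author, ttyp in sentences:
        sentence_words = list(words_of(node))
        words.extend(sentence_words)
        authors.extend([author] * len(sentence_words))
        types.extend([ttyp] * len(sentence_words))
    return words, authors, types


# Stand-in for the Text-Fabric lookup: sentence node -> word nodes (made up)
fake_words = {10: (1, 2, 3), 11: (4, 5)}

words, authors, types = expand_labels(
    [(10, "author", "description"), (11, "Esther", "discourse")],
    fake_words.__getitem__,
)
print(list(zip(words, authors, types)))
```

Because all three lists grow by the same amount in each iteration, `authors[i]` and `types[i]` always describe `words[i]`, which is what the output loop relies on.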

Writing the data

In [16]:

prev_clause = None  # clause node of the previous word
prev_phrase = None  # phrase node of the previous word

# The with-block closes the file, so every row is flushed to disk;
# newline='' lets the csv module control line endings.
with open('./esther_' + data + '_data.csv', 'w', newline='', encoding='utf-8') as f:
    csvWriter = csv.writer(f)
    csvWriter.writerow(['book','lex','sp','vs','vt','ctyp','cnode','ptyp','gloss','speaker','ttype','clausespeaker','clausettype'])

    for i, n in enumerate(wordNode):
        section = T.sectionFromNode(n)
        scripture = section[0] + str(section[1]) + ':' + str(section[2])

        clauseNode = L.u(n, otype='clause')
        phraseNode = L.u(n, otype='phrase')

        # Clause-level columns are filled only on the first word of each clause
        if clauseNode[0] == prev_clause:
            clauseCode = ''
            cnode = ''
            clauseAuthor = ''
            clauseTtyp = ''
        else:
            clauseCode = F.typ.v(clauseNode[0])
            cnode = clauseNode[0]
            clauseAuthor = authordata[i]
            clauseTtyp = ttypdata[i]

        # The phrase type is likewise written only on the first word of each phrase
        if phraseNode[0] == prev_phrase:
            phraseCode = ''
        else:
            phraseCode = F.typ.v(phraseNode[0])

        # Words without a verbal stem or tense carry 'NA'; write those as empty cells
        verbal_stem = F.vs.v(n)
        verbal_tense = F.vt.v(n)
        if verbal_stem == 'NA':
            verbal_stem = ''
        if verbal_tense == 'NA':
            verbal_tense = ''

        csvWriter.writerow([scripture, F.lex_utf8.v(n), F.sp.v(n), verbal_stem, verbal_tense, clauseCode, cnode, phraseCode, F.gloss.v(L.u(n, otype='lex')[0]), authordata[i], ttypdata[i], clauseAuthor, clauseTtyp])

        prev_clause = clauseNode[0]
        prev_phrase = phraseNode[0]
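The output loop writes clause and phrase values only on the first word of each clause or phrase and leaves the following rows blank. A minimal sketch of that "first of a run" pattern on its own, with made-up clause numbers in place of real BHSA nodes. The function name `first_of_run` is hypothetical.

```python
def first_of_run(values):
    """Keep a value where it differs from the previous one; blank the repeats."""
    out = []
    previous = object()  # sentinel that compares unequal to every input value
    for value in values:
        out.append(value if value != previous else "")
        previous = value
    return out


# Made-up clause node per word: three words, then two, then one
clause_of_word = [501, 501, 501, 502, 502, 503]
blanked = first_of_run(clause_of_word)
print(blanked)
```

Blank cells make clause boundaries easy to see in a spreadsheet, but a consumer that needs the clause value on every row has to forward-fill the column first.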

Creating the description file

Extract every sentence in Esther, then remove the sentences classified as discourse or letter.

In [3]:

book = 'Esther'
chpNode = T.nodeFromSection((book,))
chpNode

Out[3]:

In [8]:

sentenceNode = L.d(chpNode, otype='sentence')
sentenceNode = list(sentenceNode)
len(sentenceNode)

Out[8]:

In [9]:

data = "description"

# The node numbers below are the sentences classified as discourse or letter
sentencelist = [
    1229042,
    1229044,
    1229045,
    1229046,
    1229047,
    1229048,
    1229049,
    1229050,
    1229051,
    1229052,
    1229053,
    1229054,
    1229055,
    1229056,
    1229062,
    1229063,
    1229064,
    1229065,
    1229066,
    1229124,
    1229138,
    1229139,
    1229140,
    1229141,
    1229142,
    1229143,
    1229144,
    1229148,
    1229182,
    1229183,
    1229184,
    1229187,
    1229188,
    1229189,
    1229190,
    1229192,
    1229193,
    1229194,
    1229195,
    1229196,
    1229197,
    1229198,
    1229199,
    1229212,
    1229213,
    1229214,
    1229216,
    1229217,
    1229222,
    1229223,
    1229224,
    1229225,
    1229228,
    1229229,
    1229230,
    1229242,
    1229243,
    1229246,
    1229247,
    1229248,
    1229249,
    1229257,
    1229259,
    1229261,
    1229264,
    1229266,
    1229269,
    1229271,
    1229273,
    1229274,
    1229275,
    1229276,
    1229277,
    1229278,
    1229279,
    1229281,
    1229282,
    1229283,
    1229284,
    1229289,
    1229294,
    1229295,
    1229296,
    1229302,
    1229303,
    1229304,
    1229305,
    1229308,
    1229309,
    1229310,
    1229311,
    1229312,
    1229313,
    1229316,
    1229317,
    1229319,
    1229320,
    1229328,
    1229332,
    1229333,
    1229335,
    1229353,
    1229354,
    1229355,
    1229356,
    1229357,
    1229358,
    1229359,
    1229360,
    1229361,
    1229363,
    1229364,
    1229365,
    1229366,
    1229367,
    1229402,
    1229403,
    1229404,
    1229405,
    1229406,
    1229407,
    1229408,
    1229410,
    1229411,
    1229412,
    1229059,
    1229153,
    1229154,
    1229255,
    1229371,
    1229431,
    1229433,
    1229434,
    1229435,
    1229436,
    1229437,
    1229438,
    1229439
]

len(sentencelist)

Out[9]:

In [7]:

# Remove the sentences listed above from the full list of Esther sentences
sentenceNode2 = [x for x in sentenceNode if x not in sentencelist]
print(sentenceNode2)

sentencedata = []
for sentence in sentenceNode2:
    wordNode = L.d(sentence, otype='word')
    sentencedata.extend(list(wordNode))

wordNode = tuple(sentencedata)

# len(sentenceNode2)

[1229022, 1229023, 1229024, 1229025, 1229026, 1229027, 1229028, 1229029, 1229030, 1229031, 1229032, 1229033, 1229034, 1229035, 1229036, 1229037, 1229038, 1229039, 1229040, 1229041, 1229043, 1229057, 1229058, 1229060, 1229061, 1229067, 1229068, 1229069, 1229070, 1229071, 1229072, 1229073, 1229074, 1229075, 1229076, 1229077, 1229078, 1229079, 1229080, 1229081, 1229082, 1229083, 1229084, 1229085, 1229086, 1229087, 1229088, 1229089, 1229090, 1229091, 1229092, 1229093, 1229094, 1229095, 1229096, 1229097, 1229098, 1229099, 1229100, 1229101, 1229102, 1229103, 1229104, 1229105, 1229106, 1229107, 1229108, 1229109, 1229110, 1229111, 1229112, 1229113, 1229114, 1229115, 1229116, 1229117, 1229118, 1229119, 1229120, 1229121, 1229122, 1229123, 1229125, 1229126, 1229127, 1229128, 1229129, 1229130, 1229131, 1229132, 1229133, 1229134, 1229135, 1229136, 1229137, 1229145, 1229146, 1229147, 1229149, 1229150, 1229151, 1229152, 1229155, 1229156, 1229157, 1229158, 1229159, 1229160, 1229161, 1229162, 1229163, 1229164, 1229165, 1229166, 1229167, 1229168, 1229169, 1229170, 1229171, 1229172, 1229173, 1229174, 1229175, 1229176, 1229177, 1229178, 1229179, 1229180, 1229181, 1229185, 1229186, 1229191, 1229200, 1229201, 1229202, 1229203, 1229204, 1229205, 1229206, 1229207, 1229208, 1229209, 1229210, 1229211, 1229215, 1229218, 1229219, 1229220, 1229221, 1229226, 1229227, 1229231, 1229232, 1229233, 1229234, 1229235, 1229236, 1229237, 1229238, 1229239, 1229240, 1229241, 1229244, 1229245, 1229250, 1229251, 1229252, 1229253, 1229254, 1229256, 1229258, 1229260, 1229262, 1229263, 1229265, 1229267, 1229268, 1229270, 1229272, 1229280, 1229285, 1229286, 1229287, 1229288, 1229290, 1229291, 1229292, 1229293, 1229297, 1229298, 1229299, 1229300, 1229301, 1229306, 1229307, 1229314, 1229315, 1229318, 1229321, 1229322, 1229323, 1229324, 1229325, 1229326, 1229327, 1229329, 1229330, 1229331, 1229334, 1229336, 1229337, 1229338, 1229339, 1229340, 1229341, 1229342, 1229343, 1229344, 1229345, 1229346, 1229347, 1229348, 
1229349, 1229350, 1229351, 1229352, 1229362, 1229368, 1229369, 1229370, 1229372, 1229373, 1229374, 1229375, 1229376, 1229377, 1229378, 1229379, 1229380, 1229381, 1229382, 1229383, 1229384, 1229385, 1229386, 1229387, 1229388, 1229389, 1229390, 1229391, 1229392, 1229393, 1229394, 1229395, 1229396, 1229397, 1229398, 1229399, 1229400, 1229401, 1229409, 1229413, 1229414, 1229415, 1229416, 1229417, 1229418, 1229419, 1229420, 1229421, 1229422, 1229423, 1229424, 1229425, 1229426, 1229427, 1229428, 1229429, 1229430, 1229432, 1229440, 1229441, 1229442, 1229443, 1229444, 1229445, 1229446, 1229447, 1229448, 1229449, 1229450, 1229451, 1229452, 1229453, 1229454, 1229455]

Out[7]:
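The filter above tests membership against a list, which is fine at this size. A sketch of the same order-preserving filter with a set lookup, plus a check that the kept and excluded nodes together account for every sentence. The function name `drop_excluded` and the node numbers are made up for illustration.

```python
def drop_excluded(all_nodes, excluded):
    """Return all_nodes without the excluded ones, keeping the original order."""
    excluded_set = set(excluded)  # constant-time membership tests
    return [n for n in all_nodes if n not in excluded_set]


# Made-up node numbers standing in for sentenceNode and sentencelist
all_nodes = list(range(100, 110))
excluded = [103, 101, 108]

kept = drop_excluded(all_nodes, excluded)

# Every node should land in exactly one of the two groups
assert len(kept) + len(set(excluded) & set(all_nodes)) == len(all_nodes)
print(kept)
```

The count check is a cheap way to catch a node number in the exclusion list that does not belong to the book at all.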

Run the data output section above.

Create the sentence statistics file from the sentenceNode2 data.

In [87]:

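# The with-block closes the file so all rows reach disk before the CSV is read again
with open('./esther_description.csv', 'w', newline='', encoding='utf-8') as f:
    csvWriter = csv.writer(f)
    csvWriter.writerow(['verse','sentence_node','speakers','type'])

    for n in sentenceNode2:
        section = T.sectionFromNode(n)
        scripture = section[0] + ' ' + str(section[1]) + ':' + str(section[2])

        # Every remaining sentence is narration by the author
        csvWriter.writerow([scripture, n, 'author', 'description'])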
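Once the statistics file exists, a quick tally can confirm it looks plausible. A minimal sketch that counts sentences per chapter from the `verse` column, assuming the `Book chapter:verse` format written above. The function name `sentences_per_chapter` and the sample rows are hypothetical and do not come from the real file.

```python
import csv
import io
from collections import Counter


def sentences_per_chapter(handle):
    """Count rows per chapter, reading references such as 'Esther 1:1'."""
    counts = Counter()
    for row in csv.DictReader(handle):
        reference = row["verse"].rsplit(" ", 1)[1]  # e.g. '1:1'
        chapter = int(reference.split(":")[0])
        counts[chapter] += 1
    return counts


# Hypothetical sample in the layout of esther_description.csv
sample = io.StringIO(
    "verse,sentence_node,speakers,type\n"
    "Esther 1:1,1,author,description\n"
    "Esther 1:2,2,author,description\n"
    "Esther 2:1,3,author,description\n"
)
chapter_counts = sentences_per_chapter(sample)
print(dict(chapter_counts))
```

To run it on the real output, pass an open file handle instead of the `io.StringIO` sample.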
