scriptVersion="0.5.2"
scriptDate="July 10, 2024"
Ideally, a comprehensive documentation set should be created as part of developing a Text-Fabric dataset. In practice, however, this is not always completed during the initial phase or kept up to date after changes to features. This Jupyter Notebook contains Python code that automatically generates a documentation set for any Text-Fabric dataset, which also keeps the documentation consistent with the data. It serves as a starting point for developing a brand new documentation set, or as validation for an existing one. One major advantage is that the resulting documentation set is fully hyperlinked, a task that can be laborious if done manually.
The main steps in producing the documentation set are:
The output format can be either Markdown, the standard for feature documentation stored on GitHub using its on-site processor, or HTML, which facilitates local storage and browsing with any web browser.
Your environment should (for obvious reasons) include the Python package Text-Fabric. Text-Fabric requires at least Python version 3.7.0. If not installed yet, it can be installed using pip. More details on installing the Text-Fabric package can be found in tf.about.install.
Furthermore, it must be possible to load the Text-Fabric dataset (either from an online resource or from a locally stored copy). There are no further requirements, as the scripts basically operate stand-alone.
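The prerequisites above can be checked with a few lines of Python. This is only a sketch: the helper name `checkEnvironment` is hypothetical and not part of this notebook, and it assumes that Text-Fabric is importable under the module name `tf`, as in the import statements used further down.

```python
import sys
import importlib.util

def checkEnvironment(minVersion=(3, 7, 0), moduleName="tf"):
    """Return (pythonOk, packageFound) for the current interpreter."""
    # Compare the running Python version with the required minimum
    pythonOk = sys.version_info[:3] >= minVersion
    # find_spec returns None when a top-level module cannot be located
    packageFound = importlib.util.find_spec(moduleName) is not None
    return pythonOk, packageFound

pythonOk, packageFound = checkEnvironment()
print(f"Python >= 3.7.0: {pythonOk}; Text-Fabric importable: {packageFound}")
```

If the second value is `False`, install Text-Fabric with pip before running the remaining cells.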
In this step, the Text-Fabric dataset is loaded; its embedded data will be used to create the documentation set.
The dataset to load is specified in the parameters, as detailed below:
A = use ("{GitHub user name}/{repository name}", version="{version}", hoist=globals())
For other possible storage locations and other load options, see the documentation for the function use.
%load_ext autoreload
%autoreload 2
# Loading the Text-Fabric code
# Note: it is assumed Text-Fabric is installed in your environment
from tf.fabric import Fabric
from tf.app import use
# load the app and data
A = use ("ETCBC/BHSA", hoist=globals())
Locating corpus resources ...
Name | # of nodes | # slots / node | % coverage |
---|---|---|---|
book | 39 | 10938.21 | 100 |
chapter | 929 | 459.19 | 100 |
lex | 9230 | 46.22 | 100 |
verse | 23213 | 18.38 | 100 |
half_verse | 45179 | 9.44 | 100 |
sentence | 63717 | 6.70 | 100 |
sentence_atom | 64514 | 6.61 | 100 |
clause | 88131 | 4.84 | 100 |
clause_atom | 90704 | 4.70 | 100 |
phrase | 253203 | 1.68 | 100 |
phrase_atom | 267532 | 1.59 | 100 |
subphrase | 113850 | 1.42 | 38 |
word | 426590 | 1.00 | 100 |
Note: the first two lines were changed to reflect this tf version
# If the following variable is set, it will be used as the title for all pages. It is intended to describe the dataset in one line
customPageTitleMD="ETCBC/BHSA"
customPageTitleHTML="ETCBC/BHSA"
# Specify the location to store the resulting files, relative to the location of this notebook (without a trailing slash).
resultLocation = "BHSA_TF_features"
# Type of output format ('html' for HTML, 'md' for Markdown, or 'both' for both HTML and Markdown)
typeOutput='md'
# HTML table style definition (only relevant for HTML output format)
htmlStyle='<style>\ntable {\nborder-collapse: collapse;\n}\n th, td {\nborder: 1px solid black;\n padding: 8px;\n}\nth {\nfont-weight: bold;\n}\n</style>'
# Limit the number of entries in the frequency tables per node type on each feature description page to this number
tableLimit=10
# This switch can be set to 'True' if you want additional information, such as dictionary entries and file details, to be printed. For basic output, set this switch to 'False'.
verbose=False
# Create the footers for MD and HTML, include today's date
from datetime import datetime
today = datetime.today()
formatted_date = today.strftime("%b. %d, %Y")
footerMD=f'\n\nCreated on {formatted_date} using [Doc4TF version {scriptVersion} ({scriptDate})](https://github.com/tonyjurg/Doc4TF/blob/main/CreateFeatureDoc.ipynb)'
footerHTML=f'\n<p>Created on {formatted_date} using <a href=\"https://github.com/tonyjurg/Doc4TF/blob/main/CreateFeatureDoc.ipynb\">Doc4TF - version {scriptVersion} ({scriptDate})</a></p></body></html>'
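The cells further down only write files when `typeOutput` is `'md'`, `'html'`, or `'both'`; any other value silently results in no output. A minimal sketch of an up-front check is shown here. The helper `outputExtensions` is hypothetical and not part of this notebook; the extensions `md` and `htm` are the ones used for the generated files.

```python
def outputExtensions(typeOutput):
    """Map the typeOutput setting to the file extensions that will be written."""
    mapping = {"md": ["md"], "html": ["htm"], "both": ["md", "htm"]}
    if typeOutput not in mapping:
        raise ValueError(f"typeOutput must be one of {sorted(mapping)}, got {typeOutput!r}")
    return mapping[typeOutput]

print(outputExtensions("both"))
```

Calling this right after the settings cell turns a typo such as `'markdown'` into an immediate error instead of an empty result directory.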
The following will create a dictionary containing all relevant information for the loaded node and edge features.
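To make the structure of that dictionary concrete, the sketch below shows one entry in the shape that `processFeature` stores. The feature name, description, values, and counts are invented for illustration only and do not come from any dataset.

```python
# Hypothetical entry, shaped like the ones stored in featureDict.
# All names, values and counts below are invented for illustration.
exampleEntry = {
    "name": "exampleFeature",
    "descr": "An invented feature description",
    "type": "Node",          # 'Node' or 'Edge'
    "datatype": "String",    # 'String', 'Integer' or 'Unknown'
    "freqlist": {
        # One sub-dictionary per node type on which the feature has values
        "word": {
            "nodetype": "word",
            "freq": [("a", 120), ("b", 70)],  # (value, count) pairs, most frequent first
            "total": 190,                     # sum of all counts for this node type
            "truncated": False,               # True if 'freq' was cut off at tableLimit
        }
    },
}

for nodeType, data in exampleEntry["freqlist"].items():
    topValue, topCount = data["freq"][0]
    print(f"{nodeType}: most frequent value {topValue!r} ({topCount} of {data['total']})")
```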
# Initialize an empty dictionary to store feature data
featureDict = {}
import time
overallTime = time.time()
def getFeatureDescription(metaData):
    """
    This function looks for the 'description' key in the metadata dictionary. If the key is found,
    it returns the corresponding description. If the key is not present, it returns a default
    message indicating that no description is available.

    Parameters:
        metaData (dict): A dictionary containing metadata about a feature.

    Returns:
        str: The description of the feature if available, otherwise a default message.
    """
    return metaData.get('description', "No feature description")

def setDataType(metaData):
    """
    This function checks for the 'valueType' key in the metadata. If the key is present, it
    returns 'String' if the value is 'str', and 'Integer' for other types. If the 'valueType' key
    is not present, it returns 'Unknown'.

    Parameters:
        metaData (dict): A dictionary containing metadata, including the 'valueType' of a feature.

    Returns:
        str: A string indicating the determined data type ('String', 'Integer', or 'Unknown').
    """
    if 'valueType' in metaData:
        return "String" if metaData["valueType"] == 'str' else "Integer"
    return "Unknown"

def processFeature(feature, featureType, featureMethod):
    """
    Processes a given feature by extracting metadata, description, and data type, and then
    compiles frequency data for different node types in a feature dictionary. Certain features
    are skipped based on their type. The processed data is added to a global feature dictionary.

    Parameters:
        feature (str): The name of the feature to be processed.
        featureType (str): The type of the feature ('Node' or 'Edge').
        featureMethod (function): A function to obtain feature data.

    Returns:
        None: The function updates a global dictionary with processed feature data and does not return anything.
    """
    # Obtain the meta data
    featureMetaData = featureMethod(feature).meta
    featureDescription = getFeatureDescription(featureMetaData)
    dataType = setDataType(featureMetaData)
    # Initialize dictionary to store feature frequency data
    featureFrequencyDict = {}
    # Skip the frequency analysis for specific features based on type
    if not (featureType == 'Node' and feature == 'otype') and not (featureType == 'Edge' and feature == 'oslots'):
        for nodeType in F.otype.all:
            frequencyLists = featureMethod(feature).freqList(nodeType)
            # Calculate the total frequency
            if not isinstance(frequencyLists, int):
                frequencyTotal = sum(freq for _, freq in frequencyLists)
            else:
                frequencyTotal = frequencyLists
            # Calculate the number of entries
            if not isinstance(frequencyLists, int):
                numberOfEntries = len(frequencyLists)
            else:
                numberOfEntries = 1 if frequencyLists != 0 else 0
            # Check the length of the frequency table
            truncated = numberOfEntries > tableLimit
            if not isinstance(frequencyLists, int):
                if len(frequencyLists)!=0:
                    featureFrequencyDict[nodeType] = {'nodetype': nodeType, 'freq': frequencyLists[:tableLimit], 'total': frequencyTotal, 'truncated': truncated}
            elif isinstance(frequencyLists, int):
                if frequencyLists != 0:
                    featureFrequencyDict[nodeType] = {'nodetype': nodeType, 'freq': [("Link", frequencyLists)], 'total': frequencyTotal, 'truncated': truncated}
    # Add processed feature data to the main dictionary
    featureDict[feature] = {'name': feature, 'descr': featureDescription, 'type': featureType, 'datatype': dataType, 'freqlist': featureFrequencyDict}
########################################################
# MAIN FUNCTION #
########################################################
########################################################
# Gather general information #
########################################################
print('Gathering generic details')
# Initialize default values
corpusName = A.appName
liveName = ''
versionName = A.version
# Trying to locate corpus information
if A.provenance:
    for parts in A.provenance[0]:
        if isinstance(parts, tuple):
            key, value = parts[0], parts[1]
            if verbose: print (f'General info: {key}={value}')
            if key == 'corpus': corpusName = value
            if key == 'version': versionName = value
            # value for live is a tuple
            if key == 'live': liveName=value[1]
if liveName is not None and len(liveName)>1:
    # A URL was found
    pageTitleMD = f'Doc4TF pages for [{corpusName}]({liveName}) (version {versionName})'
    pageTitleHTML = f'<h1>Doc4TF pages for <a href="{liveName}">{corpusName}</a> (version {versionName})</h1>'
else:
    # No URL found
    pageTitleMD = f'Doc4TF pages for {corpusName} (version {versionName})'
    pageTitleHTML = f'<h1>Doc4TF pages for {corpusName} (version {versionName})</h1>'
# Overwrite in case user provided a title
if 'customPageTitleMD' in globals():
    pageTitleMD = customPageTitleMD
if 'customPageTitleHTML' in globals():
    pageTitleHTML = customPageTitleHTML
########################################################
# Processing node features #
########################################################
print('Analyzing Node Features: ', end='')
for nodeFeature in Fall():
    if not verbose: print('.', end='') # Progress indicator
    processFeature(nodeFeature, 'Node', Fs)
    if verbose: print(f'\nFeature {nodeFeature} = {featureDict[nodeFeature]}\n') # Print feature data if verbose
########################################################
# Processing edge features #
########################################################
print('\nAnalyzing Edge Features: ', end='')
for edgeFeature in Eall():
    if not verbose: print('.', end='') # Progress indicator
    processFeature(edgeFeature, 'Edge', Es)
    if verbose: print(f'\nFeature {edgeFeature} = {featureDict[edgeFeature]}\n') # Print feature data if verbose
########################################################
# Sorting feature dictionary #
########################################################
# Sort the feature dictionary alphabetically by keys
sortedFeatureDict = {k: featureDict[k] for k in sorted(featureDict)}
# Print the sorted feature dictionary if verbose
if verbose:
    print("\nSorted Feature Dictionary:")
    for key, value in sortedFeatureDict.items():
        print(f"Feature {key} = {value}")
print(f'\nFinished in {time.time() - overallTime:.2f} seconds.')
Gathering generic details
Analyzing Node Features: ..................................................................................
Analyzing Edge Features: ...
Finished in 36.33 seconds.
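In the cell above, the result of `freqList` is handled in two ways: as a sequence of (value, count) pairs, or as a single integer, which is then stored as one row labelled `Link`. The sketch below isolates that logic in a small stand-alone function so it can be tried without a loaded dataset. The name `summarizeFrequencies` is hypothetical and not part of this notebook.

```python
def summarizeFrequencies(frequencyLists, tableLimit=10):
    """Return (rows, total, truncated) for (value, count) pairs or a bare integer count."""
    if isinstance(frequencyLists, int):
        # A bare count is shown as a single 'Link' row, or as no row at all if zero
        rows = [("Link", frequencyLists)] if frequencyLists != 0 else []
        return rows, frequencyLists, False
    total = sum(count for _, count in frequencyLists)
    truncated = len(frequencyLists) > tableLimit
    # Keep only the first tableLimit rows for the frequency table
    return list(frequencyLists[:tableLimit]), total, truncated

# Invented sample data: three values, table limited to two rows
print(summarizeFrequencies([("a", 3), ("b", 2), ("c", 1)], tableLimit=2))
```

Note that the total always covers all values, also when the table itself is truncated.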
Two types of pages will be created:
import os
import time
overallTime = time.time()
# Initialize a counter for the number of files created
filesCreated = 0
# Get the current working directory and append a path separator for path building
pathFull = os.getcwd() + os.sep
# Make sure the output directory exists before any file is written
os.makedirs(resultLocation, exist_ok=True)
# Iterating over each feature in the feature dictionary
for featureName, featureData in sortedFeatureDict.items():
    # Extracting various properties of each feature
    featureDescription = featureData.get('descr')
    featureType = featureData.get('type')
    featureDataType = featureData.get('datatype')
    # Initializing strings to accumulate HTML and Markdown content
    nodeListHTML = nodeListMD = ''
    tableListHTML = tableListMD = ''
    frequencyData = featureData.get('freqlist')
    # Processing frequency data for each node
    for node in frequencyData:
        # Building HTML and Markdown links for each node
        nodeListHTML += f' <a href=\"featuresbynodetype.htm#{node}\">{node}</a>'
        nodeListMD += f' [`{node}`](featuresbynodetype.md#{node}) '
        # Starting HTML and Markdown tables for frequency data
        tableListHTML += f'<h3>Frequency for nodetype <a href=\"featuresbynodetype.htm#{node}\">{node}</a></h3><table><tr><th>Value</th><th>Occurrences</th></tr>'
        tableListMD += f'### Frequency for nodetype [{node}](featuresbynodetype.md#{node})\nValue|Occurrences\n---|---\n'
        # Populating tables with frequency data
        itemData = frequencyData.get(node).get('freq')
        for item in itemData:
            handleSpace = item[0] if item[0] != ' ' else 'space' # prevent garbling of tables where the value itself is a space
            tableListHTML += f'<tr><td>{handleSpace}</td><td>{item[1]}</td></tr>'
            tableListMD += f'{handleSpace}|{item[1]}\n'
        tableListHTML += f'</table>\n'
        # Add total of featuredata for this node type
        total=frequencyData.get(node).get('total')
        truncated=frequencyData.get(node).get('truncated')
        # Add a note to both HTML and Markdown if the table was truncated
        if truncated:
            truncatedNote = ' Note: table truncated.'
        else:
            truncatedNote = ''
        tableListHTML += f'<p>Total frequency of feature: {total}.{truncatedNote}</p> '
        tableListMD += f'\nTotal frequency of feature: {total}.{truncatedNote}\n '
    # Creating info blocks for HTML and Markdown
    infoBlockHTML = f'<table><tr><th>Data type</th><th>Feature type</th><th>Available for nodes</th></tr><tr><td><a href=\"featuresbydatatype.htm#{featureDataType}\">{featureDataType}</a></td><td><a href="featuresbytype.htm#{featureType}">{featureType}</a></td><td>{nodeListHTML}</td></tr></table>'
    infoBlockMD = f'Data type|Feature type|Available for nodes\n---|---|---\n[`{featureDataType}`](featuresbydatatype.md#{featureDataType.lower()})|[`{featureType}`](featuresbytype.md#{featureType.lower()})|{nodeListMD}'
    # Outputting in Markdown format
    if typeOutput in ('md','both'):
        pageMD = f'{pageTitleMD}\n# Feature: {featureName}\n{infoBlockMD}\n## Description\n{featureDescription}\n## Feature Values\n{tableListMD} {footerMD} '
        fileNameMD = os.path.join(resultLocation, f"{featureName}.md")
        try:
            with open(fileNameMD, "w", encoding="utf-8") as file:
                file.write(pageMD)
            filesCreated += 1
            # Log if verbose mode is on
            if verbose: print(f"Markdown content written to {pathFull + fileNameMD}")
        except Exception as e:
            print(f"Exception: {e}")
            break # Stops execution on encountering an exception
    # Outputting in HTML format
    if typeOutput in ('html','both'):
        pageHTML = f'<html><head>{htmlStyle}</head><body><p>{pageTitleHTML}</p>\n<h1 id=\"start\">Feature: {featureName}</h1>\n{infoBlockHTML}\n<h2>Description</h2>\n<p>{featureDescription}</p>\n<h2>Feature Values</h2>\n{tableListHTML} {footerHTML}'
        fileNameHTML = os.path.join(resultLocation, f"{featureName}.htm")
        try:
            with open(fileNameHTML, "w", encoding="utf-8") as file:
                file.write(pageHTML)
            filesCreated += 1
            # Log if verbose mode is on
            if verbose: print(f"HTML content written to {pathFull + fileNameHTML}")
        except Exception as e:
            print(f"Exception: {e}")
            break # Stops execution on encountering an exception
# Reporting the number of files created
if filesCreated != 0:
    print(f'Finished in {time.time() - overallTime:.2f} seconds (written {filesCreated} {"html and md" if typeOutput == "both" else typeOutput} files to directory {pathFull + resultLocation})')
else:
    print('No files written')
Finished in 0.09 seconds (written 85 md files to directory C:\Users\tonyj\OneDrive\Documents\GitHub\parashot\Tools\BHSA_TF_features)
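The Markdown frequency table that the cell above builds for each node type can also be sketched as a stand-alone function. The function name `frequencyTableMD` is hypothetical and not part of this notebook. Escaping of the pipe character is an added assumption here; the notebook itself only replaces a value consisting of a single space.

```python
def frequencyTableMD(nodeType, rows, total, truncated):
    """Build a Markdown frequency table for one node type."""
    lines = [
        f"### Frequency for nodetype [{nodeType}](featuresbynodetype.md#{nodeType})",
        "Value|Occurrences",
        "---|---",
    ]
    for value, count in rows:
        # A single space is shown as the word 'space', as in the notebook.
        # Assumption: pipes are escaped so a value cannot break the table layout.
        shown = "space" if value == " " else str(value).replace("|", "\\|")
        lines.append(f"{shown}|{count}")
    note = " Note: table truncated." if truncated else ""
    lines.append("")
    lines.append(f"Total frequency of feature: {total}.{note}")
    return "\n".join(lines)

# Invented sample rows for illustration
print(frequencyTableMD("word", [(" ", 4), ("a|b", 1)], 5, True))
```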
import os
import time
overallTime = time.time()
# Initialize a counter for the number of files created
filesCreated = 0
def exampleData(feature):
    """
    This function checks if the specified feature exists in the global `featureDict` and if it
    has a non-empty frequency list. If so, it extracts the first few values from this frequency
    list to create a list of examples.

    Parameters:
        feature (str): The name of the feature for which examples are to be created.

    Returns:
        str: A string containing the examples concatenated together. Returns "No values" if the
        feature does not exist in `featureDict` or if it has an empty frequency list.
    """
    # Check if the feature exists in featureDict and has non-empty freqlist.
    if feature in featureDict and featureDict[feature]['freqlist']:
        # Get the frequency list of the first node type in the freqlist
        freq_list = next(iter(featureDict[feature]['freqlist'].values()))['freq']
        # Join the first four values into a string of examples.
        example_list = ' '.join(f'`{item[0]}`' for item in freq_list[:4])
        return example_list
    else:
        return "No values"

def writeToFile(fileName, content, fileType, verbose):
    """
    Writes provided content to a specified file. If verbose is True, prints a confirmation message.
    This function attempts to write the given content to a file with the specified name. It handles
    any exceptions during writing and can optionally print a message upon successful writing. The function
    also increments a global counter `filesCreated` for each successful write operation.

    Parameters:
        fileName (str): The name of the file to write to.
        content (str): The content to be written to the file.
        fileType (str): The type of file (used for informational messages; e.g., 'md' for Markdown, 'html' for HTML).
        verbose (bool): If True, prints a message upon successful writing.

    Returns:
        None: The function does not return a value but writes content to a file and may print messages.
    """
    global filesCreated
    try:
        with open(fileName, "w", encoding="utf-8") as file:
            file.write(content)
        filesCreated+=1
        if verbose:
            print(f"{fileType.upper()} content written to {fileName}")
    except Exception as e:
        print(f"Exception while writing {fileType.upper()} file: {e}")
# Set up some lists
nodeFeatureList = []
typeFeatureList = []
dataTypeFeatureList = []
for featureName, featureData in sortedFeatureDict.items():
    typeFeatureList.append((featureName,featureData.get('type')))
    dataTypeFeatureList.append((featureName,featureData.get('datatype')))
    for node in featureData.get('freqlist'):
        nodeFeatureList.append((node, featureName))
###########################################################
# Create the page with overview per node type (e.g. word) #
###########################################################
pageMD=f'{pageTitleMD}\n# Overview features by node type\nOverview by [name](featuresbyname.md), [data type](featuresbydatatype.md), or [feature type](featuresbytype.md).\n'
pageHTML=f'<html><head>{htmlStyle}</head><body><p>{pageTitleHTML}</p>\n<h1>Overview features by node type</h1><p>Overview by <a href="featuresbyname.htm">name</a>, <a href="featuresbydatatype.htm">data type</a>, or <a href="featuresbytype.htm">feature type</a>.</p>'
# Sort the list alphabetically based on the second item of each tuple (featureName)
nodeFeatureList = sorted(nodeFeatureList, key=lambda x: x[1])
# Iterate over node types
for NodeType in F.otype.all:
    NodeItemTextMD=f'## {NodeType}\n\nFeature|Feature type|Data type|Description|Examples\n---|---|---|---|---\n'
    NodeItemTextHTML=f'<h2 id=\"{NodeType}\">{NodeType}</h2>\n<table><tr><th>Feature</th><th>Feature type</th><th>Data type</th><th>Description</th><th>Examples</th></tr>\n'
    for node, feature in nodeFeatureList:
        if node == NodeType:
            featureData=featureDict[feature]
            featureDescription=featureData.get('descr')
            featureType=featureData.get('type')
            featureDataType=featureData.get('datatype')
            # Markdown anchors are lowercased, as on the other overview pages
            NodeItemTextMD+=f"[`{feature}`]({feature}.md#readme)|[`{featureType}`](featuresbytype.md#{featureType.lower()})|[`{featureDataType}`](featuresbydatatype.md#{featureDataType.lower()})|{featureDescription}|{exampleData(feature)}\n"
            NodeItemTextHTML+=f"<tr><td><a href=\"{feature}.htm#start\">{feature}</a></td><td><a href=\"featuresbytype.htm#{featureType}\">{featureType}</a></td><td><a href=\"featuresbydatatype.htm#{featureDataType}\">{featureDataType}</a></td><td>{featureDescription}</td><td>{exampleData(feature)}</td></tr>\n"
    NodeItemTextHTML+=f"</table>\n"
    pageHTML+=NodeItemTextHTML
    pageMD+=NodeItemTextMD
pageHTML+=f'{footerHTML}'
pageMD+=f'{footerMD}'
# Write to file by calling common function
if typeOutput in ('md','both'):
    fileNameMD = os.path.join(resultLocation, "featuresbynodetype.md")
    writeToFile(fileNameMD, pageMD, 'md', verbose)
if typeOutput in ('html','both'):
    fileNameHTML = os.path.join(resultLocation, "featuresbynodetype.htm")
    writeToFile(fileNameHTML, pageHTML, 'html', verbose)
####################################################################
# Create the page with overview per data type (string or integer) #
####################################################################
pageMD=f'{pageTitleMD}\n# Overview features by data type\nOverview by [name](featuresbyname.md), [node type](featuresbynodetype.md), or [feature type](featuresbytype.md).\n'
pageHTML=f'<html><head>{htmlStyle}</head><body><p>{pageTitleHTML}</p>\n<h1>Overview features by data type</h1>\n<p>Overview by <a href="featuresbyname.htm">name</a>, <a href="featuresbynodetype.htm">node type</a>, or <a href="featuresbytype.htm">feature type</a>.</p>'
# Sort the list alphabetically based on the second item of each tuple (data type)
dataTypeFeatureList = sorted(dataTypeFeatureList, key=lambda x: x[1])
DataItemTextMD=DataItemTextHTML=''
for DataType in ('Integer','String'):
    DataItemTextMD=f'## {DataType}\n\nFeature|Feature type|Available on nodes|Description|Examples\n---|---|---|---|---\n'
    DataItemTextHTML=f'<h2 id=\"{DataType}\">{DataType}</h2>\n<table><tr><th>Feature</th><th>Feature type</th><th>Available on nodes</th><th>Description</th><th>Examples</th></tr>\n'
    for feature, featureDataType in dataTypeFeatureList:
        if featureDataType == DataType:
            featureDescription=featureDict[feature].get('descr')
            featureType=featureDict[feature].get('type')
            nodeListMD=nodeListHTML=''
            for thisNode in sortedFeatureDict[feature]['freqlist']:
                nodeListMD+=f'[`{thisNode}`](featuresbynodetype.md#{thisNode}) '
                nodeListHTML+=f'<a href=\"featuresbynodetype.htm#{thisNode}\">{thisNode}</a> '
            DataItemTextMD+=f"[`{feature}`]({feature}.md#readme)|[`{featureType}`](featuresbytype.md#{featureType.lower()})|{nodeListMD}|{featureDescription}|{exampleData(feature)}\n"
            DataItemTextHTML+=f"<tr><td><a href=\"{feature}.htm#start\">{feature}</a></td><td><a href=\"featuresbytype.htm#{featureType}\">{featureType}</a></td><td>{nodeListHTML}</td><td>{featureDescription}</td><td>{exampleData(feature)}</td></tr>\n"
    DataItemTextHTML+=f"</table>\n"
    pageMD+=DataItemTextMD
    pageHTML+=DataItemTextHTML
pageHTML+=f'{footerHTML}'
pageMD+=f'{footerMD}'
# Write to file by calling common function
if typeOutput in ('md','both'):
    fileNameMD = os.path.join(resultLocation, "featuresbydatatype.md")
    writeToFile(fileNameMD, pageMD, 'md', verbose)
if typeOutput in ('html','both'):
    fileNameHTML = os.path.join(resultLocation, "featuresbydatatype.htm")
    writeToFile(fileNameHTML, pageHTML, 'html', verbose)
##################################################################
# Create the page with overview per feature type (edge or node) #
##################################################################
pageMD=f'{pageTitleMD}\n# Overview features by feature type\nOverview by [name](featuresbyname.md), [node type](featuresbynodetype.md), or [data type](featuresbydatatype.md).\n'
pageHTML=f'<html><head>{htmlStyle}</head><body><p>{pageTitleHTML}</p>\n<h1 id=\"start\">Overview features by feature type</h1>\n<p>Overview by <a href="featuresbyname.htm">name</a>, <a href="featuresbynodetype.htm">node type</a>, or <a href="featuresbydatatype.htm">data type</a>.</p>'
# Sort the list alphabetically based on the second item of each tuple (feature type)
typeFeatureList = sorted(typeFeatureList, key=lambda x: x[1])
for featureType in ('Node','Edge'):
    ItemTextMD=f'## {featureType}\n\nFeature|Data type|Available on nodes|Description|Examples\n---|---|---|---|---\n'
    ItemTextHTML=f'<h2 id=\"{featureType}\">{featureType}</h2>\n<table><tr><th>Feature</th><th>Data type</th><th>Available on nodes</th><th>Description</th><th>Examples</th></tr>\n'
    for thisFeature, thisFeatureType in typeFeatureList:
        if featureType == thisFeatureType:
            featureDescription=featureDict[thisFeature].get('descr')
            featureDataType=featureDict[thisFeature].get('datatype')
            nodeListMD=nodeListHTML=''
            for thisNode in sortedFeatureDict[thisFeature]['freqlist']:
                nodeListMD+=f'[`{thisNode}`](featuresbynodetype.md#{thisNode}) '
                nodeListHTML+=f'<a href=\"featuresbynodetype.htm#{thisNode}\">{thisNode}</a> '
            ItemTextMD+=f"[`{thisFeature}`]({thisFeature}.md#readme)|[`{featureDataType}`](featuresbydatatype.md#{featureDataType.lower()})|{nodeListMD}|{featureDescription}|{exampleData(thisFeature)}\n"
            ItemTextHTML+=f"<tr><td><a href=\"{thisFeature}.htm\">{thisFeature}</a></td><td><a href=\"featuresbydatatype.htm#{featureDataType}\">{featureDataType}</a></td><td>{nodeListHTML}</td><td>{featureDescription}</td><td>{exampleData(thisFeature)}</td></tr>\n"
    ItemTextHTML+=f"</table>\n"
    pageMD+=ItemTextMD
    pageHTML+=ItemTextHTML
pageHTML+=f'{footerHTML}'
pageMD+=f'{footerMD}'
# Write to file by calling common function
if typeOutput in ('md','both'):
    fileNameMD = os.path.join(resultLocation, "featuresbytype.md")
    writeToFile(fileNameMD, pageMD, 'md', verbose)
if typeOutput in ('html','both'):
    fileNameHTML = os.path.join(resultLocation, "featuresbytype.htm")
    writeToFile(fileNameHTML, pageHTML, 'html', verbose)
####################################################################
# Create the page with an alphabetical overview of features #
####################################################################
pageMD=f'{pageTitleMD}\n# Overview features by name (alphabetical)\nOverview by [node type](featuresbynodetype.md), [feature type](featuresbytype.md), or [data type](featuresbydatatype.md).\n'
pageHTML=f'<html><head>{htmlStyle}</head><body><p>{pageTitleHTML}</p>\n<h1>Overview features by name (alphabetical)</h1>\n<p>Overview by <a href="featuresbynodetype.htm">node type</a>, <a href="featuresbytype.htm">feature type</a>, or <a href="featuresbydatatype.htm">data type</a>.</p>'
# Initialize Markdown and HTML strings
DataItemTextMD = '\nFeature|Feature type|Data type|Available on nodes|Description|Examples\n---|---|---|---|---|---\n'
DataItemTextHTML = '<table><tr><th>Feature</th><th>Feature type</th><th>Data type</th><th>Available on nodes</th><th>Description</th><th>Examples</th></tr>\n'
# Loop through the sorted dictionary
for featureKey in sortedFeatureDict:
    featureDetails = sortedFeatureDict[featureKey]
    feature = featureDetails.get('name')
    featureDescription = featureDetails.get('descr')
    featureType = featureDetails.get('type')
    featureDataType = featureDetails.get('datatype')
    nodeListMD = nodeListHTML = ''
    for thisNode in featureDetails['freqlist']:
        nodeListMD += f'[`{thisNode}`](featuresbynodetype.md#{thisNode}) '
        nodeListHTML += f'<a href="featuresbynodetype.htm#{thisNode}">{thisNode}</a> '
    DataItemTextMD += f"[`{feature}`]({feature}.md#readme)|[`{featureType}`](featuresbytype.md#{featureType.lower()})|[`{featureDataType}`](featuresbydatatype.md#{featureDataType.lower()})|{nodeListMD}|{featureDescription}|{exampleData(feature)}\n"
    # The HTML ids on the overview pages are capitalized (e.g. id="Node"), so the case is kept here
    DataItemTextHTML += f"<tr><td><a href=\"{feature}.htm#start\">{feature}</a></td><td><a href=\"featuresbytype.htm#{featureType}\">{featureType}</a></td><td><a href=\"featuresbydatatype.htm#{featureDataType}\">{featureDataType}</a></td><td>{nodeListHTML}</td><td>{featureDescription}</td><td>{exampleData(feature)}</td></tr>\n"
# Close the HTML table
DataItemTextHTML += "</table>\n"
pageMD+=DataItemTextMD
pageHTML+=DataItemTextHTML
pageHTML+=f'{footerHTML}'
pageMD+=f'{footerMD}'
# Write to file by calling common function
if typeOutput in ('md','both'):
    fileNameMD = os.path.join(resultLocation, "featuresbyname.md")
    writeToFile(fileNameMD, pageMD, 'md', verbose)
if typeOutput in ('html','both'):
    fileNameHTML = os.path.join(resultLocation, "featuresbyname.htm")
    writeToFile(fileNameHTML, pageHTML, 'html', verbose)
# Reporting the number of files created
if filesCreated != 0:
    print(f'Finished in {time.time() - overallTime:.2f} seconds (written {filesCreated} {"html and md" if typeOutput == "both" else typeOutput} files to directory {pathFull + resultLocation})')
else:
    print('No files written')
Finished in 0.01 seconds (written 4 md files to directory C:\Users\tonyj\OneDrive\Documents\GitHub\parashot\Tools\BHSA_TF_features)
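The overview pages link to section headings such as `## Node` in Markdown and `<h2 id="Node">` in HTML. The two formats treat these anchors differently: the Markdown links in this notebook use lowercase anchors, which is how GitHub renders heading anchors, whereas an HTML `id` is matched case-sensitively. The sketch below captures that difference in one place. The helper `sectionLink` is hypothetical and not part of this notebook.

```python
def sectionLink(page, section, outputFormat):
    """Return a relative link to a section heading on one of the overview pages."""
    if outputFormat == "md":
        # Heading anchors in GitHub-rendered Markdown are lowercase
        return f"{page}.md#{section.lower()}"
    if outputFormat == "html":
        # The generated HTML uses the section name verbatim as id, so keep the case
        return f"{page}.htm#{section}"
    raise ValueError(f"Unknown output format: {outputFormat!r}")

print(sectionLink("featuresbytype", "Node", "md"))
print(sectionLink("featuresbytype", "Node", "html"))
```

Node type names such as `word` are already lowercase, so links to `featuresbynodetype` are the same in both formats apart from the file extension.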
Changes with version 0.5.1:
Minor changes with version 0.5:
Changes to previous major version (0.4):
Licenced under Creative Commons Attribution 4.0 International (CC BY 4.0)