Structured Diagram Authoring

Structured authoring in OU XML separates content from presentation in two ways:

content can be rendered to different output formats, such as HTML or PDF documents;
content can be styled in different ways within the same format; for example, different stylings may be applied to different web pages without having to change the underlying content in any way.

In this note, I review some of the ways in which diagrams, charts and tables can be treated similarly: a content description is processed to generate an output, much as the structured authoring process takes an OU XML representation of the content and generates an output publication from it (for example, an HTML or PDF document styled in a particular way).

Displaying a block diagram

Several tools are available for generating block diagrams from simple textual statements about the relationships that hold between the blocks.

For example, the blockdiag tool supports the creation of diagrams of the following form:

The diagrams are created from a textual description of them (link to live example):

blockdiag {

  // branching edges to multiple children
  A -> B, C;

  // branching edges from multiple parents
  D, E -> F;

  //Flow control
  P -> Q -> R -> S -> T;
  // fold edge at R to S (S will be laid out at top level; left side)
  R -> S [folded];
}
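To make concrete what such a description encodes, here is a minimal Python sketch that reads a toy subset of the syntax above (comments, comma-separated node lists and `->` chains) into a list of edges. The function name `parse_edges` is hypothetical, and this is not how blockdiag itself parses its input; attributes such as `[folded]` are simply ignored.

```python
import re

def parse_edges(source):
    """Return (parent, child) pairs from a toy subset of blockdiag syntax."""
    # Keep only the text between the outer braces
    body = source[source.index("{") + 1 : source.rindex("}")]
    # Drop // comments and [attribute] lists, which this sketch ignores
    body = re.sub(r"//[^\n]*", "", body)
    body = re.sub(r"\[[^\]]*\]", "", body)
    edges = []
    for statement in body.split(";"):
        if "->" not in statement:
            continue
        # Each step of a chain is a comma-separated group of node names
        groups = [[name.strip() for name in group.split(",")]
                  for group in statement.split("->")]
        for parents, children in zip(groups, groups[1:]):
            for parent in parents:
                for child in children:
                    # Record each relationship once, in order of appearance
                    if (parent, child) not in edges:
                        edges.append((parent, child))
    return edges

source = """
blockdiag {
  A -> B, C;
  D, E -> F;
  P -> Q -> R -> S -> T;
  R -> S [folded];
}
"""
edges = parse_edges(source)
print(edges)
```

The point of the sketch is that the written description is already a data structure: the layout is something a tool derives from it, not something the author draws.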

We can place more complex labels into the boxes, either by linking quoted phrases directly, or by associating a label with a block element. For example:

blockdiag {
  A [label = 'First Box'];
  B [label = 'Second Box'];
  C [label = 'Third Box'];

  // branching edges to multiple children
  A -> B, C;
}

Link to live example.

By using explicit label assignments, we can retain the structure of the diagram whilst changing its labelling, for example to support internationalisation by translating labels to another language.
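As a rough illustration of that idea, the following Python sketch generates the diagram source from one fixed structure and a swappable table of labels. The helper `render_blockdiag` and both label tables are hypothetical, introduced only for this example; the French text is illustrative.

```python
def render_blockdiag(edges, labels):
    """Build blockdiag source text from a fixed structure and a label table."""
    lines = ["blockdiag {"]
    # One label assignment per node, in a stable (sorted) order
    for node in sorted(labels):
        lines.append('  {} [label = "{}"];'.format(node, labels[node]))
    # The edges are the same whatever language the labels are in
    for parent, children in edges:
        lines.append("  {} -> {};".format(parent, ", ".join(children)))
    lines.append("}")
    return "\n".join(lines)

# Structure of the diagram: A branches to B and C
edges = [("A", ["B", "C"])]

# Hypothetical label tables, one per language
labels_en = {"A": "First Box", "B": "Second Box", "C": "Third Box"}
labels_fr = {"A": "Première boîte", "B": "Deuxième boîte", "C": "Troisième boîte"}

print(render_blockdiag(edges, labels_en))
print(render_blockdiag(edges, labels_fr))
```

Only the label table changes between the two outputs; the edge statements are identical, so the translated diagram keeps the same structure.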

Examples of more complex block diagrams and associated ways of styling them can be seen on the blockdiag examples page.

The output of the blockdiag generation process is an SVG file that can be loaded into an SVG editor (PNG output is also supported). This means that the blockdiag tool could be used to generate a "sketch" drawing that an artist could then further refine, as well as finished artwork.

The developer of the blockdiag tool has also produced several other diagram generators, for example for sequence diagrams, activity diagrams, and the various sorts of diagram typically used to describe aspects of communication networks.

Generating Charts

In the same way that OU structured authoring supports the separation of content from presentation whilst still allowing us to retain a well-defined mapping between them, so too can we separate the presentation or styling of a chart from the chart itself, as well as the chart from the data that created it.

The following example shows how we can retrieve a simple data file and then display it in a variety of ways. The "structured authoring" code fragments are all that is required to generate the various outputs.

In [59]:

#Load in a tool that makes it easy to load in a datafile
import pandas as pd

#Load in some data from a file
rawdata=pd.read_csv('dummy.csv')
rawdata

Out[59]:

	x val	y val
0	1	3
1	2	4
2	3	6
3	4	10
4	5	8

Note that we can define different themes to apply to the data table, as well as picking out individual cells to highlight (see examples).

In [58]:

#Use the itable package to help us style the output of a data table

#!pip3 install itable
from itable import *

PrettyTable(rawdata, tstyle=TableStyle(theme="theme1"))

Out[58]:

x val	y val
1	3
2	4
3	6
4	10
5	8

Now let's generate some charts from the data. The approach demonstrated is based on the notion of The Grammar of Graphics by Leland Wilkinson, which has been implemented in the R and (partially) Python programming languages. Other grammar-based approaches to describing charts are available!

In [16]:

#Load in a toolkit that helps us generate charts
from ggplot import *

#Create a chart object associated with the data
#For a simple chart, just map the names of selected data columns onto the chart axes we want to display them on
g=ggplot(rawdata,aes(x='x val',y='y val'))

The chart object contains the dataset we wish to chart, and the axes we wish to map data elements onto, but it does not yet identify what sort of chart we wish to display or how we wish to display it.
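The way a chart object can hold data and axis mappings first, and only later acquire a chart type through `+`, can be sketched with a toy class. This is an illustration of the composition idea only, not ggplot's implementation; the `Chart` class, its `describe` method and the use of plain strings as layers are all hypothetical.

```python
class Chart:
    """Toy chart object: data, a mapping of columns to axes, and layers."""

    def __init__(self, data, mapping, layers=()):
        self.data = data
        self.mapping = mapping
        self.layers = tuple(layers)

    def __add__(self, layer):
        # Adding a layer returns a new chart; the original is left unchanged
        return Chart(self.data, self.mapping, self.layers + (layer,))

    def describe(self):
        kinds = ", ".join(self.layers) or "no layers yet"
        return "x='{x}', y='{y}': {kinds}".format(kinds=kinds, **self.mapping)

# The same values as the dummy data file loaded earlier
data = {"x val": [1, 2, 3, 4, 5], "y val": [3, 4, 6, 10, 8]}
base = Chart(data, {"x": "x val", "y": "y val"})

# The same base object can be turned into different sorts of chart
line_chart = base + "line"
scatter = base + "point"

print(base.describe())
print(line_chart.describe())
print((line_chart + "point").describe())
```

Because each `+` returns a new object, one base chart can be reused to produce a line chart, a scatterplot, or a chart with both layers.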

In [3]:

#We can now say what sort of chart we want to generate from the chart object
#For example, a line chart
g + geom_line()

Out[3]:

<ggplot: (8789487752182)>

In [4]:

#Or a scatterplot
g + geom_point()

Out[4]:

<ggplot: (8789468785854)>

In [65]:

#We can also annotate the chart in a textual way
labelled_chart = g  + geom_line() \
                    + ggtitle("The title of my chart") \
                    + xlab("Updated x-axis label") \
                    + ylab("Updated y-axis label") 
            
labelled_chart            

Out[65]:

<ggplot: (8789467104527)>

As well as generating a particular chart type (a line chart or scatterplot, for example) from the chart object that contains the data, we can also style a generated chart in much the same way that we might style an HTML document.

In [6]:

labelled_chart + theme_bw()

Out[6]:

<ggplot: (-9223363247386506773)>

In [10]:

labelled_chart + theme_538()

Out[10]:

<ggplot: (-9223363247386771505)>

In [68]:

#We can further annotate the chart if required - for example, setting axis limits
labelled_chart + theme_seaborn() + xlim(0,6) + ylim(0,15)

Out[68]:

<ggplot: (-9223363247386761959)>

As with many drawing programs, we can also add further layers to a chart.

In [84]:

#Fixed values such as a colour or size are passed to the geom directly, rather than mapped via aes()
labelled_chart + theme_seaborn() + xlim(0,6) + ylim(0,15) \
               + geom_point(color='red',size=50)

Out[84]:

<ggplot: (8789467085099)>

We can also generate interactive HTML versions of simple charts from the rendered chart objects.

In [42]:

#The mpld3 utility can generate HTML charts from chart objects
import mpld3

#All the charts we generate from now on will be interactive HTML charts
mpld3.enable_notebook()
g + geom_point()

#If you hover over the chart, you should notice a popup menu appear at the bottom left of the chart.
#Click on the magnifying glass and you can select an area of the chart to zoom into.
#Click on the large + icon to drag the chart around.
#Click on the house/home symbol to reset the chart.

Out[42]:

<ggplot: (-9223363247387662704)>

In [66]:

labelled_chart

Out[66]:

<ggplot: (8789467104527)>

The mpld3 package is extensible and supports plugins, as demonstrated on the mpld3 examples page. For example, one plugin supports tooltips that pop up the value of a point in a scatter plot when a user hovers their cursor over it.

Rationale for Using Structured Diagram Authoring

In the structured-authoring-for-text approach, the separation of content and presentation means that content can be generated once and then rendered and/or styled in multiple ways for different devices or different styles of presentation.

When it comes to structured-authoring-for-diagrams, a structured description of the diagram is provided, which is then rendered as a diagram using an automated layout algorithm and further styled using an appropriate theme.

If the automated layout engine produces a diagram that is not suitably laid out, the diagram can typically be saved in a format that allows it to be edited in a traditional drawing package. In a sense, the automated production of the diagram might be thought of as a "born digital sketch" of a diagram that can be created in logical terms by an author and then finished off by an artist.

As abstract representations of charts become more complete, and with the Python programming language as a basis for generating diagrams, tools such as mpld3 support the rendering of the same chart object into formats with different styling and functionality. For example, the same chart object could be rendered as a high-quality print-style image for use in a PDF using an OU house style, or converted to an interactive HTML chart for inclusion in online materials. In the R language, which has a more complete implementation of the ggplot grammar of graphics than the Python version used above, the rCharts library provides support for automatically generating interactive HTML5 charts from R ggplot chart objects.

Maintenance of diagrams is supported at the abstract, definitional layer, as well as the presentational layer. For example, in a block diagram, the inclusion of an additional block, or removal of a block, would be achieved by modifying the structured, written version of the diagram and then re-rendering it.

Ideally, any and all text included in the final diagram should be present in the structured, written version of it - the rendered version of the diagram is then generated from the original structured written version. In addition, styling requirements should also be specified in the authored version of the diagram, rather than being applied in an image editor.

Rather than a handover of an image taking the form of an author's hand-drawn sketch and a text file containing the required text, a handover may take the form of a structured written version of the diagram, together with a rendering of it as a sketch. If necessary, an author could also annotate a printout of the rendered version with additional comments, for example regarding styling or features that they could not find a way of writing into the structured version.

In the same way that edits to material in the VLE should take place at the OU XML level, rather than to rendered HTML, changes to diagrams should not take place at the rendered level but at the structured written level. In line with structured authoring practice, styling and semantic elements should be separated where possible. For example, rather than specify a particular colour to emphasise a block in a block diagram, a particular sort of emphasis might be defined which is then mapped on to a colour. This is akin to the conventional mapping of emphasis and strong elements in HTML onto italic and bold styling, for example.
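A minimal Python sketch of that mapping follows. It assumes that a block's colour can be set with a `color` attribute in the diagram source; the theme names, emphasis kinds, colour values and the `style_nodes` helper are all hypothetical.

```python
# Hypothetical themes: each maps a kind of emphasis to a concrete colour
THEMES = {
    "screen": {"key": "#FFD700", "warning": "#FF6347"},
    "print": {"key": "#CCCCCC", "warning": "#999999"},
}

def style_nodes(emphasis, theme_name):
    """Turn semantic emphasis marks into colour attribute statements."""
    theme = THEMES[theme_name]
    return [
        '  {} [color = "{}"];'.format(node, theme[kind])
        for node, kind in sorted(emphasis.items())
    ]

# The author records only which blocks are emphasised and how, not the colour
emphasis = {"B": "key", "C": "warning"}

for name in ("screen", "print"):
    print("// theme:", name)
    print("\n".join(style_nodes(emphasis, name)))
```

Changing the house style then means editing one theme table, while every diagram that marks a block as "key" or "warning" stays untouched.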

When generating charts for use in course materials, the structured written approach supports veracity of the materials, as well as maintenance of them, by deriving charts from the actual source data, rather than allowing human judgement to place data points on the chart. Once again, we have a model of machines generating chart layout from data, and then the further application of style on top of the chart (for example, whether gridlines are shown or not, how labels are displayed, etc). Generating charts from data also means that accessibility support tools may be built on top of the actual data used to generate any visual depictions of it.
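As one hedged illustration of the accessibility point, the Python sketch below derives a short text description from the same values that were charted earlier. The `describe_series` function and the wording of its summary are hypothetical; a real accessibility tool would need considerably more than this.

```python
def describe_series(xs, ys, x_name="x val", y_name="y val"):
    """Summarise a data series in words, using the data behind the chart."""
    lowest = min(ys)
    highest = max(ys)
    # Find the x values at which the extremes first occur
    x_at_lowest = xs[ys.index(lowest)]
    x_at_highest = xs[ys.index(highest)]
    return (
        "{n} points. {y} ranges from {lo} (at {x} = {xlo}) "
        "to {hi} (at {x} = {xhi})."
    ).format(n=len(xs), y=y_name, x=x_name, lo=lowest, hi=highest,
             xlo=x_at_lowest, xhi=x_at_highest)

# The values from the dummy data file used in the chart examples
xs = [1, 2, 3, 4, 5]
ys = [3, 4, 6, 10, 8]
summary = describe_series(xs, ys)
print(summary)
```

Because the description is computed from the data rather than read off an image, it stays in step with the chart whenever the data changes.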