Guide to Computational Publishing¶

By Adam Mazel, Digital Publishing Librarian, Indiana University ¶

What is Computational Publishing?¶

Computational publishing (CP) is a recent (2017-present) and often free means of creating and disseminating digital text and computer code in a way that enhances reader interactivity. It enables you to write and publish on the web human-readable text alongside machine-readable code that the reader can run, modify, and extend. Thus, your reader can do more than just read or even run your code: in CP, your reader can also modify your code and write new code alongside your own. CP readers can therefore also be CP writers, particularly of code.

Examples¶

Examples of Computational Publishing include:

Aesthetic Programming by Winnie Soon & Geoff Cox

This digital textbook offers a sidebar to enable reader interactivity: code running, revision, and creation.

Introduction to Colab and Python by The TensorFlow Authors

This guide is created in Google Colaboratory (discussed below) to mix human-readable text with machine-readable Python code that can be run, revised, and created by its reader. These actions can modify the original publication (if those privileges are granted to readers).

Python for Computational Science and Engineering by Hans Fangohr

This digital textbook uses the software Binder (discussed below) to enable the mixing of human-readable text with machine-readable Python code that can be run, revised, and created by its reader. These actions cannot modify the original publication.

Journal of Digital History

The JDH also uses Binder to publish data-driven scholarship.

Further Example¶

This guide also uses Binder to exemplify CP:

Reader, run the code cell below by clicking in it and then pressing "Run" in the menu above. Then, change the variable "a" to a new number (e.g., a = 6), and run the cell again.

In [1]:

a = 10
print(a-5)

Why Computational Publishing?¶

Because readers can execute author-written code and see its results, CP facilitates evaluation of the reproducibility of data science work: readers can test the writer's code themselves. Moreover, because CP allows readers to modify code (for example, by inputting different queries), it enables readerly exploration and inquiry. Lastly, because CP allows readers to write their own code, say in response to a prompt by the writer, it suits the creation and completion of educational exercises and activities, particularly in code-based instruction in fields such as computer and data science. CP is therefore an apt medium for both reproducible research and pedagogy.

Computational Publishing for Educational Materials¶

CP can be used to create and publish on the web interactive, data-driven educational materials, such as textbooks, assignments / activities, and lecture notes. Assignments, for instance, can be created, shared, and distributed by faculty; completed and returned by students; and then commented on and graded, automatically or manually. Moreover, CP can be used to create what are known as Open Educational Resources (OER): educational materials that are free of cost and most copyright restrictions, so others can reuse them and / or create revised versions of them.

How is Computational Publishing Done?¶

Computational Publishing can be done with various software, but its software stack is often some combination of the following:

Markdown is a markup language for formatting plain text.
Jupyter Notebooks and Jupyter Book are software for writing and sharing on the web documents that mix text--formatted via Markdown--with runnable code in over 40 programming languages, including Python and R. The former is for creating and sharing single documents of text and code; the latter is for creating and sharing books of text and code.
GitHub is a web-based repository and interface for storing and sharing code and files, such as Jupyter Notebooks.
Binder enables authors and readers to write, run, and share / publish Jupyter Notebooks on the web via GitHub without having to download, install, or run anything.
JupyterHub enables Jupyter Notebooks to be run, modified, and shared / published on the web without the reader having to download, install, or run anything. JupyterHub hosts Jupyter notebooks on a server with multiple users whose access can be controlled via authentication.
Google Colaboratory enables authors and readers to write, run, and share / publish Jupyter Notebooks on the web via Google Drive without having to download, install, or run anything.
Microsoft Azure Lab Services enables Jupyter Notebooks to be run, modified, and shared / published on the web without the reader having to download, install, or run anything.
Docker creates the virtual environment that contains / provides all the software and data necessary for the code in the publication to execute so that CP readers can run and write code in computational publications without having to download, install, or run anything.

Docker--and Binder, JupyterHub, Colab, and Azure--obviate the need for downloads or installs for code to be executable and writable by readers by providing all of the necessary software and data for the code to run. For example, if the computational publication employs Python, readers do not need to have Python installed on their computer to read, execute, or modify the code in the publication. This provision of software in a virtual environment is what enables readers to interact with the published text and code.
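To make the Jupyter Notebook entry in the list above more concrete: a notebook file is a JSON document whose cells hold either Markdown text or runnable code. The sketch below builds a minimal two-cell notebook using only Python's standard library. It assumes version 4 of the notebook format; the helper name `make_notebook` is hypothetical, and the cell contents are borrowed from this guide's own example.

```python
import json


def make_notebook(text, code):
    """Return a minimal notebook (assumed format: nbformat 4) as a plain dict."""
    return {
        "nbformat": 4,
        "nbformat_minor": 4,
        "metadata": {},
        "cells": [
            # Human-readable text, written in Markdown.
            {"cell_type": "markdown", "metadata": {}, "source": text},
            # Machine-readable code that the reader can run and modify.
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": code,
            },
        ],
    }


notebook = make_notebook(
    "Change the variable `a` to a new number and run the cell again.",
    "a = 10\nprint(a - 5)",
)

# Serialize to JSON; saving this text under a name ending in .ipynb
# should give a file that Jupyter can open.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

In practice, authors rarely write this JSON by hand; the Jupyter interface produces it when a notebook is saved.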

Which Software? Pros / Cons¶

Binder is free of charge. It ensures user privacy. It does not (on its own) allow reader changes to be saved. Additional software requirements are loaded via requirements files. It has more of a learning curve.
Google Colab is free of charge. As it is a Google product, user data may be reused or sold. It does allow readers to save and share their changes. Additional software requirements are loaded manually in the environment. It has less of a learning curve.
Microsoft Azure is not free of charge. It has the steepest learning curve.
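As an illustration of the requirements files mentioned for Binder, the sketch below renders a set of pinned packages as the lines of a `requirements.txt` file, one common kind of dependency file for Python projects. The helper name `requirements_text` is hypothetical, and the package names and version numbers are placeholders chosen only for illustration; substitute whatever your own notebook imports.

```python
def requirements_text(pins):
    """Render {package: version} pins as requirements.txt lines."""
    return "".join(
        f"{name}=={version}\n" for name, version in sorted(pins.items())
    )


# Placeholder pins, for illustration only.
pins = {"pandas": "2.2.2", "matplotlib": "3.9.0"}

content = requirements_text(pins)
print(content, end="")
```

Pinning exact versions is a design choice: it makes it more likely that readers run the code against the same software the author used.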

CP Software Particularly Apt for Classroom Use¶

Google Colaboratory
JupyterHub
nbgrader is a tool for creating, grading, and commenting on Jupyter Notebook-based assignments.
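A notebook-based assignment typically pairs a cell the student completes with a cell of tests that grade it. The sketch below shows what an instructor's version of such an exercise might look like. It assumes nbgrader's default solution markers (`### BEGIN SOLUTION` / `### END SOLUTION`), which mark the region that is replaced by a stub in the version released to students; the exercise itself (`mean`) is a hypothetical example.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    ### BEGIN SOLUTION
    return sum(values) / len(values)
    ### END SOLUTION


# Test cell: every assertion must pass for the answer to earn credit.
assert mean([2, 4, 6]) == 4
assert mean([5]) == 5
```

If the student's code raises an error or an assertion fails, the cell fails and the autograder can withhold the points assigned to it.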

Overviews of Classroom Tools / Uses¶

This Jupyter book overviews how Jupyter Notebooks can be used in educational / classroom contexts.
This video overviews how JupyterHub and Jupyter Notebooks can be used in educational / classroom contexts.
This video introduces nbgrader.

How to Create a Computational Publication via Binder¶

Binder How-To Guide

Steps¶

Create a Jupyter Notebook (JN) containing both text and code. If you do not already have JN on your computer, install it here.
Go to GitHub (GH). If you do not have a GH account, create one. Then create a new repository on GH. The repo should be public, not private, and should include a README file. You may wish to update your README with information on how to use your CP.
Upload your JN into your GH repo by clicking "Add File" and committing it to the repo's main branch.
If your JN requires additional software--dependencies, libraries, etc.--to run, you will also need to upload a dependency file into your repo. Then commit it to the main branch.
Go to https://mybinder.org. Type your repo's URL into the “GitHub repo or URL” box. It should look like this:

YOUR-USERNAME/YOUR-REPO-NAME

As you type, the webpage generates a link in the “Copy the URL below…” box. It should look like this:

https://mybinder.org/v2/gh/YOUR-USERNAME/YOUR-REPO-NAME/HEAD

To ease sharing your CP, create a Binder badge by copying the Markdown or reStructuredText snippet into your README.md file. This snippet will render a badge that people can click. Click "Launch".

When complete, visit the URL above. You will see a “spinner” as Binder launches the repo. If everything ran smoothly, you’ll see a JupyterLab interface. Click the Notebook button to open the JN.
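The launch link and badge described in the steps above follow a regular pattern, so they can also be assembled by hand. The sketch below does this in Python. The function names are hypothetical, and the badge image address is an assumption based on the snippet mybinder.org generates; compare the output with what the site shows for your repo before relying on it.

```python
def binder_url(username, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a public GitHub repo."""
    return f"https://mybinder.org/v2/gh/{username}/{repo}/{ref}"


def binder_badge(username, repo, ref="HEAD"):
    """Build a Markdown badge that links to the launch URL."""
    url = binder_url(username, repo, ref)
    # Assumed badge image location; check it against mybinder.org.
    return f"[![Binder](https://mybinder.org/badge_logo.svg)]({url})"


print(binder_url("YOUR-USERNAME", "YOUR-REPO-NAME"))
print(binder_badge("YOUR-USERNAME", "YOUR-REPO-NAME"))
```

Here `ref` is the branch, tag, or commit to launch; `HEAD` points to the latest commit on the repo's default branch.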

How to Create a Computational Publication via Google Colaboratory¶

Google Colab How-To Guide

How to Create a Computational Publication via Microsoft Azure¶

Microsoft Azure How-To Guide

DOIs for Computational Publications¶

Digital Object Identifiers (DOIs) are persistent identifiers for digital objects published on the web, from websites to e-books to computational publications. They make the publication easier to find, share, and access by providing metadata and solving the problem of broken links.
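A DOI becomes a clickable, persistent link when it is appended to the doi.org resolver address. The sketch below shows that conversion. The function name `doi_url` is hypothetical, and the DOI used is a made-up placeholder, not a real identifier.

```python
def doi_url(doi):
    """Turn a DOI string into a link through the doi.org resolver."""
    doi = doi.strip()
    # Drop a leading resolver address or "doi:" label if one is present.
    for prefix in ("https://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
    return "https://doi.org/" + doi


# Placeholder DOI, for illustration only.
print(doi_url("doi:10.1234/example"))
```

Because the link goes through the resolver rather than pointing at a particular web address, it can keep working even if the publication later moves.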

To secure a DOI for your computational publication, contact Ethan Fridmanski, Data Services Librarian, IU Libraries.