Computational Storytelling
in the Liberal Arts

Douglas S. Blank

Bryn Mawr College
Department of Computer Science

Funding provided by the Association of American Colleges & Universities' Teaching to Increase Diversity and Equity in STEM (TIDES)

1. Standing on the Shoulders of Giants

With colleagues, I have been working on languages, programming, environments, and education for 20+ years.

  • Laura Blankenship, The Baldwin School (K-12)
  • Paul Grobstein, Bryn Mawr College, Biology
  • Keith O'Hara, Bard College
  • Jennie Kay, Rowan University
  • Jim Marshall, Sarah Lawrence College
  • Lisa Meeden, Swarthmore College
  • Mark Russo, The College of New Jersey
  • Steven Silvester (aka blink1073), Continuum Analytics
  • Mark Matlin and Liz McCormack, Bryn Mawr College, Physics
  • Josh Shapiro, Bryn Mawr College, Biology
  • many, many students over the years

1.1 Storytelling at Bryn Mawr College

  • Paul Grobstein (1946 - 2011), BMC Biology
  • Paul thought a lot about storytelling
  • Paul was fond of saying that science is not about "getting it right", but "getting it less wrong"

1.2 Project Jupyter: Computational Narratives as the Engine of Collaborative Data Science

  • Fernando Perez, Lawrence Berkeley National Lab, and UC Berkeley
  • Brian E. Granger, Cal Poly San Luis Obispo

The first time I met Fernando, I mentioned that the notebook must have a spelling checker. So that was our first contribution. (BTW, the world's first spelling checker was written by a Bryn Mawr College graduate, Martha Evens.)

2. Courses Taught with Jupyter

  • First public deployment of JupyterHub
  • Use JupyterHub, not student installations
  • Smaller class sizes (15 to 40 students)
  • But other schools use it with hundreds of students!
  • My BMC classes are typically 95% women

  • Instead of attempting to quell fears, or explain what we will do, we just do it
  • We are usually programming on Day 1 (within 20 minutes), even in Introduction to Computing

2.1 Languages (Kernels) used at BMC

  • Scheme
  • Assembly Language
  • Processing
  • Java 9
  • First-year writing seminar (Python, but only to call Google Charts)
  • Python

2.1.1 Developed Metakernel

  • Common magics (meta-commands that start with a percent sign)
  • Clear separation of meta-commands from language
  • Provides magics, parallel processing, and shell for all languages
  • Popular Metakernel Languages include: Octave, Matlab, Scheme, Prolog, Processing, Java9, Xonsh, ROOT, Gentoo Science Bash, Cling, Wolfram, and more!
In [2]:
%lsmagic
Available line magics:
%activity  %cd  %connect_info  %conversation  %dot  %download  %edit  %get  %help  %html  %include  %install  %install_magic  %javascript  %jigsaw  %kernel  %kx  %latex  %load  %ls  %lsmagic  %macro  %magic  %matplotlib  %parallel  %plot  %pmap  %px  %python  %reload_magics  %restart  %run  %scheme  %set  %shell

Available cell magics:
%%activity  %%brain  %%conversation  %%debug  %%dot  %%file  %%help  %%html  %%javascript  %%kx  %%latex  %%macro  %%pipe  %%processing  %%px  %%python  %%scheme  %%shell  %%show  %%simulation  %%time  %%tutor
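
The separation between meta-commands and the language itself can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: the names run_line, LINE_MAGICS, magic_shell, magic_lsmagic, and evaluate_language are hypothetical, and this is not how Metakernel is actually implemented. It only shows the idea that a line starting with a single percent sign is routed to a magic, while everything else goes to the wrapped language.

```python
# Toy illustration only: every name here is hypothetical, and this is
# not Metakernel's actual implementation.

def magic_shell(args):
    """Pretend to hand the arguments to a shell."""
    return f"(would run shell command: {args})"


def magic_lsmagic(args):
    """List the line magics registered in this toy kernel."""
    return "Available line magics: " + "  ".join(
        "%" + name for name in sorted(LINE_MAGICS)
    )


# Registry shared by every language the toy kernel might wrap.
LINE_MAGICS = {"shell": magic_shell, "lsmagic": magic_lsmagic}


def evaluate_language(code):
    """Stand-in for the wrapped language's own evaluator (Scheme, Java, ...)."""
    return f"(language evaluates: {code})"


def run_line(line):
    """Route a line to a magic if it starts with one '%', else to the language."""
    stripped = line.strip()
    if stripped.startswith("%") and not stripped.startswith("%%"):
        name, _, args = stripped[1:].partition(" ")
        handler = LINE_MAGICS.get(name)
        if handler is None:
            return f"No such magic: %{name}"
        return handler(args.strip())
    return evaluate_language(line)


print(run_line("%shell ls"))
print(run_line("(+ 1 2)"))
```

Because the registry lives outside evaluate_language, the same magics would be available no matter which language the evaluator stands in for.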

3. Storytelling in the Classroom

  • All lecture notes are notebooks
  • All student hand-ins are notebooks
  • Everything is a notebook!

Need to force students to write text. Very different from writing small, short comments.

When used in the writing seminar, students forgot how to write. The notebook interface was a shock. I was hoping to build on their knowledge of writing text/markdown to bootstrap into "computational thinking."

I have a vision that our first-year writing seminar could embrace some data visualization, if not computational thinking. Imagine elevating computing to the level of reading and writing. For everyone!

3.1 Diversity and Equity

  • Can put best practices into modules/notebooks
    • Read and write about good role models
  • All code runs on the server with JupyterHub
    • If a student has an older computer, it doesn't matter too much
  • Can work on tablets, and inexpensive computers
  • Students will connect computing to their own lives, if you let them write about it. You have to read it, though.

3.2 Lab Assignments

  • Similar in spirit to Process Oriented Guided Inquiry Learning (POGIL, also a part of TIDES)
    • Extensive step-by-step text and tasks
  • Can also include "worked examples"

Telling a story comes more naturally to Data Science (including Physics and Biology). Not so much for computer science students.

3.3 Additional Pedagogical Benefits

  • "Exit tickets" (audience participation system, like "clickers")
  • Live coding
  • No separation between lecture code and live code (found many mistakes when moving to notebooks)
  • Made up a fake language/simulation for first week in Computing Through Biology
  • Can connect to popular web services, such as Online Python Tutor
  • Play Jeopardy!
In [5]:

numbers = list(range(10))
for i in numbers:
    numbers[i] += 1
In [44]:
%activity /home/dblank/activities/activity3
Received: Results
In [43]:
%activity /home/dblank/activities/learning

3.4 NB Extensions

  • Publish
  • Submit
  • Spelling Checker
  • Move Section Up/Down
  • Number Sections
  • Table of Contents
  • Generate References
  • Tabbed In/Out
  • Two-column In/Out
  • Drawing Annotations

Jupyter is very hackable. JavaScript on the front end; Python on the backend.

3.5 NBGrader

  • Rough, but getting better
  • Most useful as means of distributing materials and collecting notebooks
  • No convenient method to return graded notebooks
  • I generally print out notebooks, and write comments on paper
    • Printing notebooks generally gives terrible output

4. Lessons Learned

Students can't write using the notebook without explicit instructions. Making analogies with writing can build on what they know about text (cut and paste, top to bottom, etc.). Student writing can give insight into their lives.

Physics and Biology professors are interested in different stories than computer scientists.


In [30]:
data = [[1, 2, 3, 8],
        [4, 5, 6, 7],
        [5, 2, 7, 2]]

Non-CS instructors like to write:

In [35]:
import numpy as np
In [36]:
np.mean(data, axis=0)
array([ 3.33333333,  3.        ,  5.33333333,  5.66666667])

Computer scientists like to write:

In [41]:
def mean(data):
    avgs = [0 for i in range(len(data[0]))]
    for row in range(len(data)):
        for col in range(len(data[row])):
            avgs[col] += data[row][col]
    avgs = [num/len(data) for num in avgs]
    return avgs
In [42]:
mean(data)
[3.3333333333333335, 3.0, 5.333333333333333, 5.666666666666667]

Why the difference? Computer scientists want to tell complexity stories, and see the loops. Non-CS users often just want to get the answer, and get it fast, in order to tell the resulting story. This translates into learning library interfaces, rather than general techniques. They are different stories, and there are pedagogical consequences.

Computer scientists typically hate the notebook. Why? It mixes up time and dependencies, which is very different from the idea of an IDE. State is hidden, and dependencies can be broken.


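For example, suppose you run a cell containing:

x = 56 + 78

If you then edit that cell and rerun it as:

y = 56 + 78

then x is still defined, even though no cell in the notebook defines it anymore!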
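
The hidden state can be imitated outside of a notebook. The following is a minimal sketch, assuming that a kernel behaves roughly like one long-lived dictionary of names; run_cell and namespace are hypothetical names for illustration, not part of Jupyter's API.

```python
# A sketch of notebook-style hidden state. `run_cell` and `namespace`
# are hypothetical names for illustration, not part of Jupyter's API.

namespace = {}  # plays the role of the kernel's memory


def run_cell(source):
    """Execute a cell's source in the shared, long-lived namespace."""
    exec(source, namespace)


run_cell("x = 56 + 78")  # first version of the cell
run_cell("y = 56 + 78")  # the same cell, edited and rerun

# The notebook text now mentions only y, but the kernel remembers both.
print("x" in namespace, namespace["x"])
print("y" in namespace, namespace["y"])

# Restarting the kernel and rerunning the notebook is like starting
# over with a fresh namespace: only the names in the current text exist.
fresh = {}
exec("y = 56 + 78", fresh)
print("x" in fresh)
```

Restarting and running all cells from top to bottom is the usual way to check that a notebook does not depend on names that are no longer in its text.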

A general CS resistance to notebooks:

  • Overheard: "Jupyter is too convenient" for the student
  • Some fellow CS educators hate Jupyter as an IDE
    • Twitter comment: "Ipython4experts, not4students"
    • "within 1 hour later, indecipherable errors and broken notebooks"
    • It is different, I say better. How to migrate educators?

5. Future Work

Jupyter is a seed. It needs thoughtful educators to help build it into a pedagogical platform.

JupyterLab is on the horizon. Can we keep all of the benefits of the notebook?