The Art of Literary Text Analysis (ALTA) has three objectives.
First, to introduce concepts and methodologies for literary text analysis programming. It doesn't assume you know how to program or how to use digital tools for analyzing texts.
Second, to show a range of analytical techniques for the study of texts. While it cannot explain and demonstrate everything, it provides a starting point for humanists with links to other materials.
Third, to provide utility notebooks you can use for operating on different texts. These are less well documented and combine ideas from the introductory notebooks.
This instance of The Art of Literary Text Analysis is written as a set of Jupyter Notebooks using the Python programming language. Other programming choices are available, and many conceptual aspects of the guide are relevant regardless of the language and implementation.
Jupyter Notebooks was chosen for three main reasons:
Python (the programming language used in these notebooks) features extensive support for text analysis and natural language processing;
Python is a good first language for those who are new to programming – it's not easy, but it strikes a good balance between power, speed, readability and learnability;
Jupyter Notebooks offers a literate programming model of writing, in which blocks of prose text (like this one) are interspersed with bits of code and output. This lets us write this guide in notebooks and lets you write up your own experiments in the same way. The Art of Literary Text Analysis focuses on thinking through analytical processes, and the documentation-rich format offered by Jupyter Notebooks is well suited to the nature of this guide and to helping you think through what you want to do.
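To give a sense of what a code block interspersed with prose looks like, here is a minimal sketch of the kind of cell you might write. It is only an illustration, not part of the guide's own notebooks: the sample sentence (the opening of Pride and Prejudice) and the helper name `word_frequencies` are assumptions chosen for this example, and the tokenization is deliberately simplistic.

```python
import re
from collections import Counter

# Sample text chosen only for illustration (an assumption, not from the guide).
text = ("It is a truth universally acknowledged, that a single man in "
        "possession of a good fortune, must be in want of a wife.")


def word_frequencies(passage):
    """Return a Counter of lowercased word tokens in the passage.

    Hypothetical helper: a token here is any run of letters or
    apostrophes, which is a rough approximation of a word.
    """
    tokens = re.findall(r"[a-z']+", passage.lower())
    return Counter(tokens)


frequencies = word_frequencies(text)

# Show the three most frequent words with their counts.
print(frequencies.most_common(3))
```

In a notebook, the printed result would appear directly beneath the cell, and you could follow it with a prose block discussing what the counts suggest before moving on to the next step.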
This guide is a work in progress. It was developed over the Winter of 2015 in conjunction with a course on literary text mining at McGill. It has been forked and extended for a course in the Winter of 2016 on big data and analysis at the University of Alberta. Here is the current outline: