
0. Introduction

Course overview

This tutorial aims to teach the basics of Python programming with a focus on its application in computational biochemistry. Assuming no previous knowledge of Python, we start by explaining the basic building blocks (e.g. variables, conditionals, and lists) and build upon these to show examples of how Python can be used to solve problems related to computational biochemistry (e.g. analysis of molecular dynamics simulations).

The tutorial does not aim to cover every aspect of Python programming; unfortunately, this would not be feasible in a half-day session. Instead, we encourage attendees to use it as a jumping-off point for learning more about Python and its many applications.

We recommend the following useful resources for python:

Python Documentation (tutorials, library reference, ...)
Codecademy Python tutorial

Course structure

The tutorial is taught across a series of Jupyter notebooks:

Introduction
Basic Types
Variables
Conditionals
Loops
Lists
File Handling
Functions
Documentation
Libraries
Numpy
Object Orientated Programming (OOP)
Protein Analysis

Each notebook is contained within its own named directory. Within these directories you will sometimes also find datafiles sub-directories that contain all the data needed to run the notebook, in addition to copies of the notebooks with the exercise solutions (whose names end in _solutions.ipynb).

We recommend that each section be followed in order; however, those with prior experience of Python may wish to jump to the later sections.

Launching the notebooks

Please see the setup.md "Starting the tutorial" instructions.

Briefly, using a terminal (Anaconda Prompt on Windows), do the following:

You will first need to activate the conda environment: conda activate OxPython
Then you will need to start the jupyter notebook server: jupyter notebook
From there navigate to the desired directory and click on the .ipynb notebook you wish to open.
Solutions for some of the notebooks are provided under <notebook_name>_solutions.ipynb; we recommend you only open these once you have finished going through a notebook.
To close the notebooks, click on "File" -> "Close and Halt".
You may then also want to quit the jupyter notebook server altogether.

Using the notebooks

Jupyter notebooks are a useful way to demonstrate Python code in a clear and organised manner. Each notebook is composed of a series of cells, which contain either text (such as this one) or code (code cells usually have In [ ]: next to them). You can edit a code cell by clicking on it and typing whatever changes you want to make.

Code cells can be used to run any valid Python code. To do this, simply click on the cell and press the Run button in the toolbar above, or press Shift+Enter.

Important note: All code executed is retained in the memory of the notebook. That means that if you import a module or declare a variable, it remains available to any code executed afterwards, typically in cells further down the notebook. There are many places in this tutorial where modules are imported early on in the notebook and used in subsequent code cells. We therefore recommend that tutorial users do not jump to later parts of a notebook without running the earlier ones.
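
As a minimal sketch of this behaviour, the block below stands in for two separate code cells, with comments marking where each hypothetical cell would start. The names radius and area are illustrative and do not come from the tutorial notebooks.

```python
# --- First cell: import a module and declare a variable ---
import math

radius = 2.0

# --- Second cell, run later: both math and radius are still available ---
area = math.pi * radius ** 2
print(area)
```

If the second cell were run in a freshly restarted notebook without running the first, Python would raise a NameError, which is why the notebooks are best followed from top to bottom.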

Notebooks can be reset to their original state by navigating to Kernel > Restart & Clear Output. A further prompt will ask you to confirm that you wish to clear all the outputs; confirming will return the notebook to a state where no code cells have been run.

0.1 A first command in Python

Let's start things off by executing our first Python command. In the cell below we will use print to write out "Hello world". Try it out by clicking on the cell and pressing Run in the toolbar above.

In [ ]:

print('Hello world')

The print function takes an input value (some text, in quotation marks) and prints the text as output (without the quotation marks). Note that Python is case-sensitive, i.e. we can't use Print or PRINT.
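
The case sensitivity can be seen in the following sketch. It wraps the wrongly capitalised call in try/except, a construct not introduced at this point in the tutorial and used here only so that the cell does not stop with an error. The name error_message is an illustrative choice.

```python
print('Hello world')  # works: print is a built-in function

try:
    Print('Hello world')  # fails: Python knows nothing called Print
except NameError as error:
    # Keep the text of the error so that it can be printed
    error_message = str(error)
    print('NameError:', error_message)
```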

Whilst print is a very simple function, it is one of the most useful. It allows us to write out any information relevant to our Python code. This can be anything from usage instructions to the output of some calculation.
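
As an illustrative sketch, the cell below prints a usage-style message and the result of a calculation. The message text and the name total are made up for this example; variables are covered properly later in the tutorial.

```python
# Text is printed without its quotation marks
print('Usage: run each cell from top to bottom')

# The result of a calculation can be stored and printed
total = 2 + 3
print(total)

# Several comma-separated inputs are printed on one line, separated by spaces
print('Two plus three is', total)
```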

Note: As of Python 3, the inputs to the print function must be placed within parentheses. If you use code developed for Python 2.7 or earlier, you may notice that this requirement is not followed; such code will cause errors if run using Python 3. There are other key differences between Python 3 and previous versions, some of which will be covered in this tutorial. For a more detailed look into them, the following resource may be of use: Porting Python 2 Code to Python 3. As Python 2.7 reached end of life in January 2020, we strongly recommend that no new code be written in Python 2.7.
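
To illustrate the difference, the sketch below asks Python 3 to check a Python 2 style print statement using the built-in compile function, without running it. The variable names are illustrative, and the try/except construct is used only to catch the resulting error.

```python
# Python 2 style: no parentheses around the input to print
python2_style = 'print "Hello world"'

try:
    # compile only checks the syntax of the code; it does not run it
    compile(python2_style, '<example>', 'exec')
    python2_style_is_valid = True
except SyntaxError:
    python2_style_is_valid = False

print('Valid in Python 3?', python2_style_is_valid)

# Python 3 style: the input is placed within parentheses
print("Hello world")
```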

Exercise 0.1.1

In the cell below, use print to output the phrase "This is the second command".

In [ ]:

# Exercise 0.1.1
print('This is the second command')

A note on Python comments

You will have noticed that the "# Exercise 0.1.1" line in the above cell does not affect the notebook output. In Python, in-line code comments are indicated using the # symbol. Any text that comes after this symbol on the same line is considered to be a comment and is ignored by the Python interpreter. It is considered good practice to comment your code in order to make it more readable for yourself and future users. Other means of commenting Python code, such as docstrings, also exist and will be briefly covered in section 8 of this tutorial.
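
The short sketch below shows a few common ways in which the # symbol is used. The printed text and the name text_with_hash are illustrative.

```python
# A comment on its own line is ignored entirely
print('Comments do not appear in the output')  # so is a comment after code

# A line of code can be disabled by turning it into a comment:
# print('This line is never executed')

# A # inside quotation marks is part of the text, not a comment
text_with_hash = 'Residue #42'
print(text_with_hash)
```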

0.2 Python scripts

Whilst Jupyter notebooks can be useful for demonstrating Python code and prototyping new ideas, the main way to run Python code is to execute a script. To do this, you write the Python instructions in a file and execute it via the Python interpreter.

For example, suppose you write the following in a file (let's assume we call it hello.py):

print("Hello world")

If you then type python hello.py in a terminal, it should print out Hello world.
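
For those who would like to try this without leaving the notebook, the sketch below writes hello.py to a temporary directory and runs it with the same Python interpreter that is running the notebook. It relies on the standard library modules subprocess, sys, tempfile and pathlib, none of which have been introduced at this point in the tutorial; the names script and script_output are illustrative.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Create the script in a temporary directory that is deleted afterwards
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / 'hello.py'
    script.write_text('print("Hello world")\n')

    # Equivalent to typing "python hello.py" in a terminal
    result = subprocess.run(
        [sys.executable, str(script)],
        capture_output=True,
        text=True,
    )
    script_output = result.stdout

print(script_output)
```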

In the remainder of this tutorial we will, for convenience, focus on executing code within jupyter notebooks. However, it is worth remembering that using written scripts/code is often a much more reproducible and manageable way to use python.

Python: An Interpreted Language

Python is an interpreted programming language. This means that Python code is read and executed line by line by the Python interpreter (the program that transforms the Python code you wrote into something that your computer can understand and run). This is different from compiled languages (such as FORTRAN or C/C++), where all the code is first translated by the compiler into something the computer can understand, and only then run.
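
One visible consequence of this is sketched below: when a line fails while the code is running, the lines before it have already been executed. The example stores three lines of code as text and runs them with the built-in exec function, which is an assumption made here for illustration rather than something used elsewhere in this tutorial. The names source and steps are illustrative.

```python
# Three lines of Python code, stored as text; the second line contains an error
source = """
steps.append('line 1 ran')
steps.append(undefined_name)
steps.append('line 3 ran')
"""

steps = []
try:
    # exec runs the stored code, much as the interpreter would run a script
    exec(source, {'steps': steps})
except NameError:
    steps.append('stopped at line 2')

print(steps)
```

The first line has run by the time the error on the second line is reached, and the third line is never executed.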

Why Python?

As an interpreted language, Python is usually slower than compiled languages such as FORTRAN and C++. However, Python is extremely flexible, can be made reasonably fast under the hood, and is relatively easy to run on different operating systems. This is why its popularity has exploded in recent years, especially in the computational and data sciences: its clean syntax allows you to write clear and concise programs while keeping the sacrifice in speed to a minimum.

In the computational sciences there are other commonly used interpreted languages, such as MATLAB and R. MATLAB's interpreter is not free and open source, so you won't be able to run your code unless you keep paying for a licence. R is a fantastic language for statistical computing, but it is very domain-specific. Python, in contrast, is free and open source and has a wide domain of applicability; it is therefore a very powerful language to know, for beginners and experts alike.

Review

In this tutorial section we have:

Detailed the contents of the tutorial.
Shown how to execute and use a jupyter notebook session.
Used a function, print.
Written our first Python script.
Run the script using the Python interpreter.
Discussed the advantages and disadvantages of Python.