Data Analysis with Jupyter Notebooks.

Tutorial 4

Benjamin J. Morgan, University of Bath.

Plotting data with matplotlib

To plot data we use another module: matplotlib. This is a very powerful (and complicated) plotting library that can be used for quick analysis of experimental data or to generate publication-quality figures. It supports an enormous number of plot types. We are going to start with simple 2D $x,y$ plots.

import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np

The import statement loads up the part of the matplotlib library we will use for plotting, and lets us refer to this as plt for convenience later.

The %matplotlib inline command tells the Jupyter notebook that we want all our plots to appear “inline”, i.e. inside the notebook (alternatives include opening the plots in other windows, or saving them as graphics files). The % symbol at the start means this is a “magic” command for controlling the behaviour of this Jupyter notebook, and is not standard Python.

If you are using a high resolution screen, you will also want to switch on high resolution figures.

%config InlineBackend.figure_format = 'retina'

We also import numpy as np so that we can store our data as arrays.

Creating a plot uses plt.plot(). Remember, we have assigned plt as shorthand for matplotlib.pyplot.

# plot the numpy arrays a and b against each other
import numpy as np
a = np.array( [ 1, 2, 3, 4 ] )
b = np.array( [ 5, 6, 7, 8 ] )
print( "a:", a )
print( "b:", b )
plt.plot( a, b )
plt.show()

This can be used for plotting $y$ as a function of $x$, e.g. $y=x^2$.

x = np.array( [0, 1, 2, 3, 4, 5] )
y = x**2
plt.plot( x, y )
plt.show()

Plot $y=\sin(x)$ for $x=0$ to $2\pi$.
To generate an array from 0 to $2\pi$, remember that you can use numpy.linspace().
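
As a reminder of how numpy.linspace() behaves, here is a minimal sketch. The variable name theta and the choice of 5 points are assumptions made for illustration only; a smooth curve would need many more points.

```python
import numpy as np

# np.linspace( start, stop, num ) returns `num` evenly spaced values,
# and includes both end points.
theta = np.linspace( 0, 2 * np.pi, 5 )  # `theta` is a name chosen for this sketch
print( "theta:", theta )
print( "number of points:", len( theta ) )
```

Increasing the third argument gives more closely spaced points, which makes a plotted curve look smoother.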

The default plot shows a connected line. To plot individual points, we can add a third argument to plt.plot() that specifies the appearance for that data set:

plt.plot( x, y, "o" )
plt.show()

A large number of marker types exist in matplotlib (a full list is given in the matplotlib.markers documentation).

We can also control the line style, and combine code controlling marker and line appearance.

plt.plot( x, y, ":" ) # dotted line
plt.show()
plt.plot( x, y, "s:" ) # dotted line with squares
plt.show()
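
The same short string can also carry a single-letter colour code alongside the marker and line codes. The sketch below assumes the x and y arrays from the earlier example, and the particular combination "ro--" is just one illustrative choice.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array( [0, 1, 2, 3, 4, 5] )
y = x**2

# "ro--" combines three codes:
#   r  = red, o = circle markers, -- = dashed line
lines = plt.plot( x, y, "ro--" )  # plt.plot() returns a list of line objects
plt.show()
```

Only a handful of colours have single-letter codes, so for anything else it is probably clearer to set the colour with a separate keyword argument.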

Adding axis labels and a title uses the xlabel(), ylabel(), and title() commands.

Edit the previous cell to include the following lines between plt.plot() and plt.show():

plt.xlabel( 'x' )
plt.ylabel( 'y' )
plt.title( 'y = x^2' )

Rerun the code cell to regenerate the plot, now with labelled axes and a title.
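
Labels and titles can also contain typeset mathematics. As a minimal sketch, assuming the same x and y arrays as above: matplotlib renders text placed between dollar signs as maths, so x^2 appears with a proper superscript.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array( [0, 1, 2, 3, 4, 5] )
y = x**2

plt.plot( x, y, "s:" )
# Text between dollar signs is rendered as mathematics.
# The r prefix makes a raw string, which keeps any backslashes intact.
plt.xlabel( r'$x$' )
plt.ylabel( r'$y$' )
plt.title( r'$y = x^2$' )
plt.show()
```

The raw-string prefix is not strictly needed here, but it becomes important as soon as a label contains a backslash command such as \sin or \pi.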

Plotting multiple data sets on the same graph uses multiple plot() commands. For an example, let us create three numpy arrays, u, v, and w.

# create three numpy arrays, u, v, and w
u = x + 1
v = x ** 2
w = np.sqrt( (x*2)+1 )
print('u = ',u)
print('v = ',v)
print('w = ',w)

Now we can plot $u$, $v$, and $w$ versus $x$ on the same figure.

plt.plot( x, u, 'o-',  label='x+1' )
plt.plot( x, v, 'x--', label='x**2' )
plt.plot( x, w, '*:',  label='sqrt((x*2)+1)' )
plt.xlabel( 'x' )
plt.ylabel( 'y' )
plt.title( 'y=f(x)')
plt.legend()
plt.show()

We have assigned text labels for each data set by setting label=string in each plt.plot() command. These labels are then shown in the legend produced by the plt.legend() command.
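
By default matplotlib chooses where to place the legend. If it covers part of the data, the position can be set by hand. The following is a sketch only: the position 'upper left' and the legend title 'Functions' are illustrative choices, not part of the original example.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array( [0, 1, 2, 3, 4, 5] )

plt.plot( x, x + 1,  'o-',  label='x+1' )
plt.plot( x, x ** 2, 'x--', label='x**2' )
# loc sets the legend position; title adds a heading above the entries
plt.legend( loc='upper left', title='Functions' )
plt.show()
```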

Plot $\sin(x)$ and $\cos(x)$ for $x=0$ to $2\pi$. Label the curves and the $x$ axis.

Nearly every part of the plot appearance can be controlled. Two further examples are line colours and thickness. A number of line colours are predefined and can be referred to with a corresponding string.

# run this cell
plt.plot( x, u, 'o-',  label='x+1',           color='salmon',          linewidth=3 )
plt.plot( x, v, 'x--', label='x**2',          color='darkolivegreen',  linewidth=2 )
plt.plot( x, w, '*:',  label='sqrt((x*2)+1)', color='slategrey',       linewidth=4 )
plt.xlabel( 'x' )
plt.ylabel( 'y' )
plt.title( 'Too many options can lead to ugly graphs' )
plt.legend()
plt.show()
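
Named colours are not the only option. As a hedged sketch, the color argument also accepts hex strings and (red, green, blue) tuples with values between 0 and 1. The specific hex string and tuple below are arbitrary values picked for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

x = np.array( [0, 1, 2, 3, 4, 5] )

# to_hex() converts any colour matplotlib recognises into a hex string
print( "salmon as hex:", to_hex( 'salmon' ) )

plt.plot( x, x + 1, color='salmon',          linewidth=3 )  # named colour
plt.plot( x, x + 2, color='#336699',         linewidth=2 )  # hex string
plt.plot( x, x + 3, color=(0.2, 0.6, 0.4),   linewidth=4 )  # (r, g, b) tuple
plt.show()
```

Hex strings and tuples are useful when a figure has to match an exact colour that has no predefined name.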

You can save a figure to an external file using plt.savefig('filename') instead of plt.show(). The file format is chosen from the filename extension, e.g. .pdf or .png.

Edit the cell above to replace

plt.show()

with

plt.savefig('my_figure.pdf')

Then run the cell to save the figure to disk.
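
plt.savefig() takes further optional arguments that are often useful for figures destined for reports. A minimal sketch, in which the filename my_figure.png and the value dpi=300 are assumptions chosen for illustration:

```python
import os
import numpy as np
import matplotlib.pyplot as plt

x = np.array( [0, 1, 2, 3, 4, 5] )

plt.plot( x, x**2, 's:' )
plt.xlabel( 'x' )
plt.ylabel( 'y' )
# The .png extension selects the PNG format.
# dpi sets the resolution of the saved image (dots per inch).
# bbox_inches='tight' trims empty space around the plot.
plt.savefig( 'my_figure.png', dpi=300, bbox_inches='tight' )
print( "file saved:", os.path.exists( 'my_figure.png' ) )
```

The file is written to the current working directory, which for a notebook is normally the folder that contains the notebook.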