Tutorial 3: Data Plots

This tutorial shows how to visualise the distributions that were built in Tutorial 1.

NOTE FOR CONTRIBUTORS: Always clear all output before committing (Cell > All Output > Clear)!

In [ ]:

# Render matplotlib figures inline in the notebook
%matplotlib inline
# Reload modules whenever they change
%load_ext autoreload
%autoreload 2

# Make clusterking package available even without installation
import sys
sys.path = ["../../"] + sys.path

In [ ]:

from clusterking.data.data import Data
from clusterking.plots import BundlePlot

First, we load the data that was created in Tutorial 1 and stored in the folder output/cluster/ under the name tutorial_basics, and pass it to the Data class.

In [ ]:

d = Data("output/cluster/", "tutorial_basics")

This data is then used to create an instance of the BundlePlot class.

In [ ]:

pb = BundlePlot(d)

We are now ready to visualise the data. Let's start by drawing the histograms corresponding to the benchmark point of each cluster:

In [ ]:

pb.plot_bundles()

We can also add more sample points to the plot in addition to the benchmark point, here for cluster 1 with nlines=3:

In [ ]:

pb.plot_bundles(1, nlines=3)

To save the above plot to the output/cluster folder, we use the following command:

In [ ]:

pb.fig.savefig("output/cluster/example_plot.png")
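Note that matplotlib's savefig does not create missing folders, so saving fails if output/cluster does not exist yet. A minimal sketch of one way to guard against this, using only the standard library; the helper name prepare_output_path is hypothetical and not part of clusterking:

```python
from pathlib import Path


def prepare_output_path(folder, filename):
    """Create `folder` (including parents) if needed and return the full file path."""
    target_dir = Path(folder)
    # exist_ok=True makes this safe to call when the folder is already there
    target_dir.mkdir(parents=True, exist_ok=True)
    return str(target_dir / filename)


# Hypothetical usage in this notebook:
# pb.fig.savefig(prepare_output_path("output/cluster", "example_plot.png"))
```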

Showing the minima and maxima of all clusters is achieved with the plot_minmax method.

In [ ]:

pb.plot_minmax()

The same plot for clusters 2 and 3 only:

In [ ]:

pb.plot_minmax([2, 3])

Removing the reference line leads to the following output:

In [ ]:

pb.plot_minmax([2, 3], reference=False)
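To make concrete what the minima and maxima in these plots refer to, the following sketch computes the bin-wise minimum and maximum over the histograms of one cluster. It uses plain Python lists with made-up numbers and is not meant to reproduce clusterking's internals; binwise_minmax is a hypothetical helper name.

```python
def binwise_minmax(histograms):
    """Return (minima, maxima) per bin for a list of equally binned histograms."""
    if not histograms:
        raise ValueError("need at least one histogram")
    n_bins = len(histograms[0])
    if any(len(h) != n_bins for h in histograms):
        raise ValueError("all histograms must have the same number of bins")
    # zip(*histograms) groups the values of each bin across all histograms
    minima = [min(bin_values) for bin_values in zip(*histograms)]
    maxima = [max(bin_values) for bin_values in zip(*histograms)]
    return minima, maxima


# Made-up example: one cluster with three distributions of four bins each
cluster = [
    [0.10, 0.40, 0.30, 0.20],
    [0.20, 0.30, 0.35, 0.15],
    [0.15, 0.45, 0.25, 0.15],
]
minima, maxima = binwise_minmax(cluster)
print(minima)  # [0.1, 0.3, 0.25, 0.15]
print(maxima)  # [0.2, 0.45, 0.35, 0.2]
```

The band between the two resulting lists is the kind of envelope that a min/max plot draws for each cluster.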

Box plots can be produced using the box_plot method:

In [ ]:

pb.box_plot(reference=True)

Showing clusters 0 and 2 only:

In [ ]:

pb.box_plot([0, 2])
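A box plot summarises the spread of the values in each bin through their quartiles. As a rough illustration of that idea (a sketch under assumptions, not clusterking's implementation), the quartiles per bin can be computed with the standard library; binwise_quartiles is a hypothetical helper name and the numbers are made up.

```python
import statistics


def binwise_quartiles(histograms):
    """Return one (q1, median, q3) tuple per bin for equally binned histograms."""
    summary = []
    # zip(*histograms) groups the values of each bin across all histograms
    for bin_values in zip(*histograms):
        # The "inclusive" method treats the data as the full population
        q1, median, q3 = statistics.quantiles(bin_values, n=4, method="inclusive")
        summary.append((q1, median, q3))
    return summary


# Made-up example: five distributions with two bins each
cluster = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
quartiles = binwise_quartiles(cluster)
print(quartiles)  # [(2.0, 3.0, 4.0), (20.0, 30.0, 40.0)]
```

Each tuple corresponds to the lower edge, centre line and upper edge of one box; how whiskers and outliers are drawn depends on the plotting library.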