Tutorial 0. Setup

This first step of the tutorial makes sure your system is set up for all the remaining sections, with the required software installed and the required data downloaded. The index covers background you might want before you start.

Getting set up

Please consult holoviz.org for the full instructions on installing the software used in these tutorials. Here is the condensed version of those instructions, assuming you have already downloaded and installed Anaconda or Miniconda:

conda create -n holoviz-tutorial python=3.7
conda activate holoviz-tutorial
conda install -c pyviz holoviz
holoviz examples
cd holoviz-examples
jupyter notebook

See the full instructions at holoviz.org if you don't yet have conda, if you have an old version of conda, or if you are using JupyterLab.
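
Before running the checks below, it can help to confirm that the notebook kernel is actually running inside the environment created above. The following is a minimal sketch, not part of the official instructions. It assumes conda sets the CONDA_DEFAULT_ENV environment variable on activation (it may be unset if Jupyter was launched some other way), and the helper name environment_summary is hypothetical:

```python
import os
import sys

def environment_summary():
    """Report which interpreter and conda environment this kernel is using."""
    return {
        'python': '.'.join(str(n) for n in sys.version_info[:3]),
        'executable': sys.executable,
        # Set by `conda activate`; None when no conda environment is active
        'conda_env': os.environ.get('CONDA_DEFAULT_ENV'),
    }

summary = environment_summary()
for key, value in summary.items():
    print('{}: {}'.format(key, value))
```

If conda_env is not holoviz-tutorial, the notebook server was probably started from a different environment.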

Once everything is installed, run the following cell to test the key imports needed for this tutorial. If it completes without raising an exception or printing an error message, your environment should be ready to go:

In [ ]:

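import pandas as pd, datashader as ds, dask, bokeh, holoviews as hv  # noqa
# distutils ships with the Python 3.7 environment created above; it was removed in Python 3.12
from distutils.version import LooseVersion

# Minimum supported version of each library, keyed by the name it was imported under
min_versions = dict(pd='0.24.0', ds='0.7.0', dask='1.2.0', bokeh='1.2.0', hv='1.12.3')

for lib, ver in min_versions.items():
    # Look up the imported module by its alias and compare its version to the minimum
    v = globals()[lib].__version__
    if LooseVersion(v) < LooseVersion(ver):
        print("Error: expected {}>={}, got {}".format(lib, ver, v))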
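
The cell above relies on distutils, which is not available on Python 3.12 and later. If you are on a newer interpreter than the one used in this tutorial, the same minimum-version comparison can be approximated with the standard library alone. This is a rough sketch rather than a full version parser: the helpers version_tuple and meets_minimum are hypothetical names, and pre-release suffixes such as rc1 or dev3 are simply ignored.

```python
import re

def version_tuple(version):
    """Return the leading numeric components of a version string as ints."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev3'
        parts.append(int(match.group()))
    return tuple(parts)

def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, comparing numeric parts only."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.2' and '1.2.0' compare as equal
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b

# Same minimums as in the cell above, checked against made-up installed versions
examples = {'1.12.3': '1.2.0', '0.23.4': '0.24.0', '1.2': '1.2.0'}
for installed, minimum in examples.items():
    print(installed, '>=', minimum, ':', meets_minimum(installed, minimum))
```

Comparing tuples of integers avoids the string-comparison trap in which '1.12.3' would sort before '1.2.0'.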

You should also see the HoloViews, Bokeh, and Matplotlib logos after running the following cell:

In [ ]:

hv.extension('bokeh', 'matplotlib')

Downloading sample data

Lastly, let's make sure the datasets needed are available. First, check that the large earthquake dataset was downloaded correctly during installation:

In [ ]:

import os
if not os.path.isfile('../data/earthquakes.parq'):
    print('Earthquakes dataset not found; please run "holoviz fetch-data --path ..".')

If you don't have holoviz installed, for example because you created the environment from scratch, you can fetch the data with pyct instead:

In [ ]:

from pyct import cmd
if not os.path.isfile('../data/earthquakes.parq'):
    cmd.fetch_data(name='holoviz', path='..')

Make sure that you have the SNAPPY dependency required to read these data:

In [ ]:

try:
    import dask.dataframe as dd
    columns = ['depth', 'id', 'latitude', 'longitude', 'mag', 'place', 'time', 'type']
    data = dd.read_parquet('../data/earthquakes.parq', columns=columns, engine='fastparquet')
    data.head()
except RuntimeError:
    print('SNAPPY is missing, so data cannot be read. If you are pip installing, '
          'follow the steps here: https://github.com/andrix/python-snappy#dependencies')
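
If the cell above reports a problem, a quick first diagnostic is to check whether the relevant modules can be found at all, without importing them. This is only a sketch under stated assumptions: it assumes the codec is provided by the python-snappy package, whose importable module is named snappy, and the helper has_module is a hypothetical name. Other fastparquet versions may use a different compression backend, so a MISSING result here does not necessarily mean the read will fail.

```python
import importlib.util

def has_module(name):
    """Return True if a top-level module called `name` can be located, without importing it."""
    return importlib.util.find_spec(name) is not None

for name in ('fastparquet', 'snappy'):
    status = 'found' if has_module(name) else 'MISSING'
    print('{}: {}'.format(name, status))
```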

If you don't see any error messages above, you should be good to go! Now that you are set up, you can continue with the rest of the tutorial sections. You can also skip ahead to sections that speak to your interests more directly, such as Building Panels, Working with Large Data, or the various real-world examples that use a variety of tools to solve a visualization problem.