#!/usr/bin/env python
# coding: utf-8

# # Bokeh Tutorial
#
# # A3. High-Level Charting with HoloViews

# Bokeh is designed to make it possible to construct rich, deeply interactive browser-based visualizations from Python source code. It has a syntax more compact and natural than older libraries like Matplotlib, but still requires a good bit of code to do relatively common data-science tasks like complex multi-figure layouts, animations, and widgets for parameter space exploration.
#
# To make it feasible to generate complex interactive visualizations "on the fly" in Jupyter notebooks while exploring data, we have created the [HoloViews](http://holoviews.org) library built on top of Bokeh.
#
# HoloViews allows you to annotate your data with a small amount of metadata that makes it instantly visualizable, usually without writing any plotting code. HoloViews makes it practical to explore datasets and visualize them from every angle interactively, wrapping up Bokeh code for common tasks into a set of configurable and composable components. HoloViews is installed separately from Bokeh, e.g. using `conda install holoviews`, and also works with Matplotlib.

# In[1]:

import holoviews as hv
import numpy as np
from holoviews import opts

hv.extension('bokeh')

# # A simple function

# First, let us define a mathematical function to explore, using the NumPy array library:

# In[2]:

def sine(x, phase=0, freq=100):
    return np.sin(freq * x + phase)

# We will examine the effect of varying phase and frequency:

# In[3]:

phases = np.linspace(0, 2*np.pi, 7)  # Explored phases
freqs = np.linspace(50, 150, 5)      # Explored frequencies

# Over a specific spatial area, sampled on a grid:

# In[4]:

dist = np.linspace(-0.5, 0.5, 81)    # Linear spatial sampling
x, y = np.meshgrid(dist, dist)
grid = (x**2 + y**2)                 # 2D spatial sampling

# # Succinct data visualization

# With HoloViews, we can immediately view our simple function as an image in a Bokeh plot in the Jupyter notebook, without writing any plotting code:

# In[5]:

hv.__version__

# In[6]:

hv.Image(sine(grid, freq=20))

# But we can just as easily use ``+`` to combine ``Image`` and ``Curve`` objects, visualizing both the 2D array (with associated histogram) and a 1D cross-section:

# In[7]:

grating = hv.Image(sine(grid, freq=20), label="Sine Grating")

((grating * hv.HLine(y=0)).hist() + grating.sample(y=0).relabel("Sine Wave"))

# Here you can see that a HoloViews object isn't really a plot (though it generates a Bokeh plot when requested for display by the Jupyter notebook); it is just a wrapper around your data, and the data can be processed directly (as when taking the cross-section using `sample()` here). In fact, your raw data is *always* still available, allowing you to go back and forth between visualizations and numerical analysis easily and flexibly:

# In[8]:

grating[0,0]

# In[9]:

type(grating.data)

# Here the underlying data is the original NumPy array, but Python dictionaries as well as Pandas and other data formats can also be supplied.
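#
# As a rough NumPy-only sketch of what the `grating[0,0]` lookup and the `sample(y=0)` cross-section correspond to in the raw array: the value at `x=0, y=0` is the middle entry of the 81x81 array, and the `y=0` cross-section is its middle row. This assumes both image axes span -0.5 to 0.5, as the sampling grid does; the names `data`, `center` and `cross_section` are illustrative only and are not part of HoloViews.

```python
import numpy as np

def sine(x, phase=0, freq=100):
    return np.sin(freq * x + phase)

dist = np.linspace(-0.5, 0.5, 81)   # 81 samples from -0.5 to 0.5
x, y = np.meshgrid(dist, dist)
grid = x**2 + y**2                  # squared distance from the origin
data = sine(grid, freq=20)          # same array that is passed to hv.Image

center = len(dist) // 2             # index 40, where dist is (numerically) 0
print(data.shape)                   # (81, 81)
print(abs(data[center, center]) < 1e-9)   # True: sin(0) at the origin

# Row `center` holds the samples with y = 0, i.e. sin(20 * x**2) along x.
cross_section = data[center]
print(np.allclose(cross_section, np.sin(20 * dist**2)))   # True
```

# Because the grid is symmetric about zero, the middle row is the `y=0` line whether rows are counted from the top or from the bottom of the image.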
#
# The underlying objects and data can always be retrieved, even in complex multi-figure objects, if you look at the `repr` of the object to find the indexes needed to address that data:

# In[10]:

layout = ((grating * hv.HLine(y=0)) + grating.sample(y=0))
print(repr(layout))
layout.Overlay.Sine_Grating.Image.Sine_Grating[0,0]

# Here `layout` is the name of the full complex object, `Overlay.Sine_Grating` selects the first item (an HLine overlaid on a grating), and `Image.Sine_Grating` selects the grating within the overlay. The grating itself is then indexed by 'x' and 'y' as shown in the repr, and the return value from such indexing is 'z' (nearly zero in this case, which you can also see by examining the curve plot above).
#
# # Interactive exploration
#
# HoloViews is designed to explore complicated datasets, where there can often be much more data than can be shown on screen at once. If there are dimensions to your data that have not been laid out as adjacent plots or overlaid plots, then HoloViews will automatically generate sliders covering the remaining range of the data. For instance, if we add an additional dimension `Y` indicating the location of the cross-section, we'll get a slider for `Y`:

# In[11]:

positions = np.linspace(-0.3, 0.3, 17)

hv.HoloMap({y: (grating * hv.HLine(y)) for y in positions}, kdims='Y') + \
hv.HoloMap({y: (grating.sample(y=y)) for y in positions}, kdims='Y')

# By default the data will be embedded fully into the output, allowing export to static HTML/JavaScript for distribution. For parameter spaces that are too large, or for dynamic data, a callback that generates the data on the fly can be used instead, via a [DynamicMap](http://holoviews.org/Tutorials/Dynamic_Map.html).
#
# # Setting display options
#
# HoloViews objects like `grating` above directly contain only your data and associated metadata, not any plotting details.
# Metadata like titles and units can be set on the objects either when created or subsequently, as shown using `label` and `relabel` above.
#
# Other properties of the visualization that are just about the view of it, not the actual data, are not stored on the HoloViews objects, but in a separate data structure. To make it easy to control such options, they can be passed to the `.opts()` method using the `opts` builders imported above:

# In[12]:

((grating * hv.HLine(y=0)).hist() + grating.sample(y=0)).opts(
    opts.HLine(line_color='white', line_width=9),
    opts.Image(cmap='RdYlGn'),
    opts.Curve(color='b', line_dash="dotted"),
)

# Options may also be set globally using the `hv.opts.defaults` function, allowing you to alter global defaults. The `opts.` accessors will tab-complete all the available options in the IPython notebook. Try adding an additional option to change the `opts.Points` default style:

# In[13]:

hv.opts.defaults(
    opts.Points(size=3)
)

### EXERCISE: Try changing various parameters in the above plot, using tab completion to discover the names and values

# ## Normalizing your data

# HoloViews is designed to make it easy to understand your data. For instance, consider two circular waves with very different amplitudes:

# In[14]:

comparison = hv.Image(sine(grid)) + hv.Image(sine(grid, phase=np.pi)*0.02)

# HoloViews ensures that these differences are visible by default, by normalizing across any elements of the same type that are displayed together, and even across the frames of an animation:

# In[15]:

comparison = hv.Image(sine(grid)) + hv.Image(sine(grid, phase=np.pi)*0.02)
comparison.opts('Image', cmap='gray')

# This default visualization makes it clear that the two patterns differ greatly in amplitude. However, it is difficult to see the structure of the low-amplitude wave in **B**.
# If you wish to focus on the spatial structure rather than the amplitude, you can instruct HoloViews to normalize data in different axes separately:

# In[16]:

comparison.opts(shared_axes=False)

# Similarly, you could set the ``framewise`` option (``+framewise`` in the older options syntax) to tell it to normalize data per frame of an animation, not across all frames as it does by default. As with any other customization, you can always specify which specific element you want the customization to apply to, even in a complex multiple-subfigure layout.
#
# # External data sources
#
# To conveniently work with external data sources such as Pandas DataFrames, HoloViews provides a `.plot`-like API for DataFrames through the `hvPlot` library. Let us load the iris dataset and explore how to use it.

# In[17]:

from bokeh.sampledata.iris import flowers
import hvplot.pandas

flowers.head()

# hvPlot gives you a fully interactive Bokeh plot in just a single line of code:

# In[18]:

flowers.hvplot.scatter('petal_length', 'petal_width')

# Even more complicated plots are just as easy, for example breaking out each `'species'`:

# In[19]:

flowers.hvplot.scatter('petal_length', 'petal_width', by='species', legend='top_left')

# It also lets you quickly explore complex datasets by grouping over one or more dimensions and generating widgets to explore the resulting parameter space:

# In[20]:

flowers.hvplot.scatter('petal_length', 'petal_width', groupby='species')

# Lastly, we can easily generate multiple subplots to create a "small multiples" plot:

# In[21]:

flowers.hvplot.scatter('petal_length', 'petal_width', by='species', subplots=True, width=400)

# As you can see, HoloViews and hvPlot provide a high-level API to work with complex datasets using interactive Bokeh plots.
#
# # Interactive features
#
# HoloViews also makes it simple to use deeply interactive features such as linked brushing. To demonstrate, we will load the `autompg` dataset, which contains various statistics for a number of car models.
# The `hvplot.scatter_matrix` function will generate a linked matrix of plots to explore all dimensions:

# In[ ]:

from bokeh.sampledata.autompg import autompg

hvplot.scatter_matrix(autompg, c='origin').opts(
    opts.Histogram(alpha=0.4)
)

# HoloViews provides a range of options for deep interactivity, including more configurable [linked brushing features](http://holoviews.org/user_guide/Linked_Brushing.html), interactive [data transformation](http://holoviews.org/user_guide/Transforming_Elements.html) and [data processing pipelines](http://holoviews.org/user_guide/Data_Pipelines.html).
#
# # Learning more
#
# If you are interested in using HoloViews in your workflow, check out the extensive tutorials at [holoviz.org](https://holoviz.org), the [hvPlot documentation](https://hvplot.holoviz.org/), the user guides at [holoviews.org](https://holoviews.org/user_guide/), and [geoviews.org](http://geoviews.org/) for geographic applications.
#
# # Next Section
#
# To explore the next topic in the appendices, click on this link: [A4 - Additional Resources](A4%20-%20Additional%20Resources.ipynb)
#
# To go back to the overview, click [here](00%20-%20Introduction%20and%20Setup.ipynb).