The previous tutorial focused on specifying elements and simple collections of them. This one explains how the visual appearance can be adjusted to bring out the most salient aspects of your data, or just to make the style match the overall theme of your document. We'll hold the data in Pandas and use HoloViews, Bokeh, and Matplotlib to display the results.
HoloViews explicitly makes the distinction between data and plotting options, which allows annotating the data with semantic metadata before deciding how to visualize the data. It also allows rendering the same object using different plotting libraries, such as Bokeh or Matplotlib.
In the annotating your data section, hv.extension('bokeh') was used at the start to load and activate the bokeh plotting extension. In this notebook, we will also briefly use matplotlib, which will be loaded, but not yet activated, by listing it second:
import pandas as pd
import holoviews as hv
from holoviews import opts, dim
hv.extension('bokeh', 'matplotlib')
Let's find some interesting data to generate elements from, before we consider how to customize them. Here is a dataset of information about 50,000 individual diamonds (including their weight in carats and their cut, quality, and price), which provides some rich opportunities for visualization:
diamonds = pd.read_csv('../data/diamonds.csv')
diamonds.head()
One obvious thing to look at is the relationship between the mass of the diamonds given in 'carat' and their 'price'. Since the dataset is large we will sample 5,000 random datapoints from the DataFrame and plot the 'carat' column against the 'price' as a Scatter:
hv.Scatter(diamonds.sample(5000), 'carat', 'price')
There is clearly structure in this data, but the data is also clearly being overplotted and being squashed into a small plot. To fix the problems, we can start customizing the appearance of this object using the HoloViews options system. Later on in the tutorial, we will see an alternative way of avoiding overplotting using Datashader in Working_with_Large_Datasets.
.opts
We noted that the data is too compressed in the x direction. Let us fix that by specifying the width option and additionally spread the data out along the y-axis by enabling a log axis using the logy option. We will also enable a Bokeh 'hover' tool letting us reveal more information about each datapoint:
scatter = hv.Scatter(diamonds.sample(5000),
'carat', ['price', 'cut']).redim.label(carat='Carat (ct)',
price='Price ($)')
scatter.opts(width=600, logy=True, tools=['hover'])
Here you can see that it's still a plot of Price vs. Carat, but if you hover over a datapoint you can see that the 'cut' information is visible for each point as you visit it.
The bottom line uses the .opts method to specify the width, logy and tools options applied to the Scatter object.
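To see why a log axis spreads the data out, it can help to compare how much of the axis a given price band occupies on each scale. The following is a minimal sketch using only the standard library; the price range and the band are illustrative numbers chosen for the arithmetic, not values read from the diamonds dataset, and axis_fraction is a hypothetical helper.

```python
import math

# Illustrative numbers only; they are NOT taken from the diamonds dataset.
lo, hi = 300.0, 18000.0      # assumed full range of the price axis
band = (300.0, 3000.0)       # the cheapest decade of prices


def axis_fraction(a, b, lo, hi, transform=lambda v: v):
    """Return the fraction of the axis length covered by the interval [a, b]."""
    return (transform(b) - transform(a)) / (transform(hi) - transform(lo))


linear_share = axis_fraction(*band, lo, hi)
log_share = axis_fraction(*band, lo, hi, transform=math.log10)

print(f"linear axis: {linear_share:.0%} of the axis")
print(f"log axis:    {log_share:.0%} of the axis")
```

With these assumed numbers the cheapest decade takes up roughly 15% of a linear axis but roughly 56% of a log axis, which is why the dense low-price region becomes easier to read with logy=True.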
In addition to specifying keywords directly on a single element, you can also make use of a convenient, tab-completable options builder (see the user guide for details):
scatter.opts(opts.Scatter(width=600, logy=True, tools=['hover']))
# Exercise: Try inspecting some of the tab-completable keywords for Curve elements
# Note: You can see the available completions by pressing Tab inside opts.<Element>
# Exercise: Try enabling the boolean show_grid plot option for the scatter above
hv.help
Tab completion helps discover what keywords are available, but you can get more complete help using the hv.help utility. For instance, to learn more about the options for hv.Scatter, run hv.help(hv.Scatter):
# hv.help(hv.Scatter)
The options applied earlier instructed HoloViews to build a plot 600 pixels wide, when rendered with the Bokeh plotting extension. Now let's specify that the Bokeh glyph should be colored by the 'cut' column using the 'Set1' colormap and reduce the 'alpha' of the points so we can see overlapping points better:
scatter.opts(color=dim('cut'), alpha=0.5, cmap='Set1')
Note how the plot options applied above to scatter are remembered! The .opts method allows incremental customization of an object without storing those options on the object itself. Behind the scenes HoloViews has linked the specified keyword options to the scatter object via a hidden integer id attribute.
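The idea of linking options to an object through an integer id can be pictured with a toy registry. This is only a sketch of the concept under simplifying assumptions: the Element class, the _registry dictionary and the options() method below are hypothetical and are not how HoloViews is actually implemented.

```python
import itertools

# Toy model only: HoloViews' real option storage is more elaborate.
_next_id = itertools.count()
_registry = {}  # maps an integer id to the options accumulated so far


class Element:
    """Hypothetical stand-in for a HoloViews element."""

    def __init__(self, data):
        self.data = data
        self.id = None  # assigned the first time options are applied

    def opts(self, **kwargs):
        """Merge new options into the registry entry for this object."""
        if self.id is None:
            self.id = next(_next_id)
        _registry.setdefault(self.id, {}).update(kwargs)
        return self

    def options(self):
        """Return a copy of the options currently linked to this object."""
        return dict(_registry.get(self.id, {}))


scatter = Element(data=[(0.3, 400), (1.0, 5000)])
scatter.opts(width=600, logy=True)       # first call
scatter.opts(alpha=0.5, cmap="Set1")     # second call adds to the first

print(scatter.options())
print(vars(scatter))  # only 'data' and 'id' live on the object itself
```

The second call adds to the first rather than replacing it, and the object carries nothing but its data and an id, which mirrors the incremental behaviour described above.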
Having used the .opts method on scatter again, we have now associated the 'color', 'alpha' and 'cmap' options to it. Unlike the width and logy options specified earlier, these options are defined by Bokeh and belong to the corresponding scatter glyph. See the HoloViews user guide for more information on the difference between these two types of options (called 'plot' and 'style' options respectively).
# Exercise: Display scatter without any new options to verify it stays colored
# Exercise: Try setting the 'size' style options to 1
# Exercise: Try using an options builder and exploring some of the completions
Let us now view our scatter plot with matplotlib using the hv.output utility:
hv.output(scatter, backend='matplotlib')
All our options are gone! This is because the options are associated with the corresponding plotting extension. If you switch back to 'bokeh', the options will be applicable again. In general, options have to be specific to backends; e.g. the size style option accepted by Bokeh is called s in matplotlib.
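One way to keep track of such naming differences is a small rename table. The sketch below is a hypothetical helper, not part of HoloViews: only the size to s pair comes from the text above, and the table would need extending by hand for any other option.

```python
# Hypothetical rename table; only the size -> s pair comes from the text.
RENAMES = {("bokeh", "matplotlib"): {"size": "s"}}


def translate(options, source, target):
    """Rename option keys from one backend's vocabulary to another's.

    Keys without an entry in the table are passed through unchanged.
    """
    table = RENAMES.get((source, target), {})
    return {table.get(name, name): value for name, value in options.items()}


bokeh_style = {"size": 4, "alpha": 0.5}
mpl_style = translate(bokeh_style, "bokeh", "matplotlib")
print(mpl_style)
```

Renaming a key does not guarantee that the value means the same thing in both libraries, so translated numbers are best treated as a starting point to adjust by eye.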
Let us briefly make matplotlib the default plotting extension using hv.output without specifying an object to customize:
hv.output(backend='matplotlib')
Now we can apply the matplotlib specific options:
selection = scatter.select(carat=(0, 3))
selection.opts(aspect=4, fig_size=400, color='blue', s=4, alpha=0.2)
# Exercise: Apply the color and alpha options as above, but to the matplotlib plot
With the matplotlib plotting extension still active, we can use hv.output to specify that we want SVG output instead of PNG:
hv.output(fig='svg')
Now we can generate an SVG image using a different set of matplotlib options:
scatter.opts(aspect=4, fig_size=400, xrotation=70, color='green', s=10, marker='^')
# Exercise: Verify for yourself that the output above is SVG and not PNG
# You can do this by right-clicking above then selecting 'Open Image in a new Tab' (Chrome) or 'View Image' (Firefox)
In previous releases of HoloViews, it was typical to switch to matplotlib in order to export to PNG or SVG, because Bokeh did not support these file formats. Since Bokeh 0.12.6 we can now easily use HoloViews to export Bokeh plots to a PNG file, as we will now demonstrate:
hv.output(backend='bokeh')
By passing fmt='png' and a filename='diamonds' to hv.save we can save the output to a PNG file before displaying the plot again:
#%%output fig='png' filename='diamonds'
hv.save(scatter, fmt='png', filename='diamonds')
scatter
Here we have requested PNG format using fmt='png' and that the output should go to diamonds.png using filename='diamonds':
ls *.png
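Listing the file confirms that it exists, but not what it contains. As a rough check, the first bytes of the file can be inspected: PNG files begin with a fixed eight-byte signature, while SVG files are XML text containing an svg tag. The sniff_format helper below is a hypothetical standard-library sketch, not a HoloViews utility, and its SVG test is a simple heuristic.

```python
from pathlib import Path

# The fixed eight-byte signature that starts every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_format(path):
    """Guess whether a file is PNG or SVG from its first kilobyte."""
    head = Path(path).read_bytes()[:1024]
    if head.startswith(PNG_SIGNATURE):
        return "png"
    if b"<svg" in head:  # heuristic: look for an opening svg tag
        return "svg"
    return "unknown"


# Only inspect the exported file if the save step above has been run.
if Path("diamonds.png").exists():
    print(sniff_format("diamonds.png"))
```

The same helper could be pointed at an exported SVG file as a way to approach the earlier exercise about telling SVG output from PNG.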
Bokeh also has some SVG support, but it is not yet exposed in HoloViews.
group and label
The above examples showed how to customize by Element type, but HoloViews offers multiple additional levels of customization that should be sufficient to cover any purpose. For our last example, let us split our diamonds dataframe based on the clarity of the diamonds, selecting the lowest and highest clarity:
low_clarity = diamonds[diamonds.clarity=='I1']
high_clarity = diamonds[diamonds.clarity=='IF']
opts.defaults(
opts.Spikes(width=900, logx=True, xticks=8, xrotation=90))
This time we will visualize the same data as a dot graph by combining Scatter elements with Spikes elements, showing the distribution of prices between the low and high clarity groups.
We can do this using the element group and label introduced in the annotating your data section as follows:
overlay = (hv.Spikes( low_clarity, 'price', 'carat', group='Diamonds', label='Low')
* hv.Scatter( low_clarity, 'price', 'carat', group='Diamonds', label='Low')
* hv.Spikes( high_clarity, 'price', 'carat', group='Diamonds', label='High')
* hv.Scatter(high_clarity, 'price', 'carat', group='Diamonds', label='High'))
overlay.opts(
opts.Spikes('Diamonds.Low', color='blue'),
opts.Spikes('Diamonds.High', color='red'))
Using the color option to distinguish between the two categories of data we can now see the clear difference between the two groups, showing that diamonds with a low clarity need to have much higher mass in carats to obtain the same price. Similar techniques can be used to provide arbitrarily specific customizations when needed.
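The way a type, group and label combine into increasingly specific customizations can be sketched with a toy resolver. This is a simplified mental model under stated assumptions, not HoloViews' actual option handling: the resolve function and the flat dictionary of 'Type.Group.Label' strings are hypothetical, and the option values simply mirror the ones used above.

```python
def resolve(options, element_type, group, label):
    """Merge options from the least to the most specific matching spec."""
    merged = {}
    for spec in (element_type,
                 f"{element_type}.{group}",
                 f"{element_type}.{group}.{label}"):
        # Later, more specific specs override earlier, more general ones.
        merged.update(options.get(spec, {}))
    return merged


# Hypothetical flat store mirroring the options used in this section.
options = {
    "Spikes": {"width": 900, "logx": True},
    "Spikes.Diamonds.Low": {"color": "blue"},
    "Spikes.Diamonds.High": {"color": "red"},
}

low = resolve(options, "Spikes", "Diamonds", "Low")
high = resolve(options, "Spikes", "Diamonds", "High")
scatter_low = resolve(options, "Scatter", "Diamonds", "Low")

print(low)
print(high)
print(scatter_low)  # nothing here targets Scatter, so this is empty
```

In this model both sets of spikes share the type-level settings and differ only in color, while the Scatter elements are untouched because no spec names them. Adding a "Spikes.Diamonds" entry would be the group-level analogue.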
# Exercise: Remove the call to the .opts method above and observe the effect
# Exercise: Give the 'Low' clarity scatter points a black 'line_color'
# Optional Exercise: Try differentiating the two sets of spikes by group and not label
We have now seen some of the ways you can customize the appearance of your visualizations. You can consult our Customizing Plots user guide to learn about other approaches, including the notebook-specific magic syntax that was used in older versions of HoloViews, as well as how to clear options using the .opts.clear method.
You may also wish to consult the extra A1 Exploration with Containers tutorial, which gives examples of how the appearance of elements can be customized when viewed in containers. In the next tutorial, Working with Tabular Data, we will see how to use the flexibility offered by HoloViews when working with tabular data.