Lesson 42: Interactive plotting with Bokeh and HoloViews

(c) 2018 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This document was prepared at Caltech with financial support from the Donna and Benjamin M. Rosen Bioengineering Center.

This lesson was generated from a Jupyter notebook. You can download the notebook here.



In [1]:
import numpy as np
import pandas as pd
import scipy.integrate

# Use IPython widgets for interacting
import ipywidgets

# Import Bokeh modules for interactive plotting
import bokeh.io
import bokeh.models
import bokeh.plotting

# Import HoloViews for high level plotting
import holoviews as hv

bokeh.io.output_notebook()
hv.extension('bokeh')
Loading BokehJS ...

The data set

In this lesson, we will explore some of Bokeh's features using the finch beak data from Exercise 4. Upon completing that exercise, you should have created a tidy data frame with the data for several years and stored it in data/grant_complete.csv in your repo. If you did not, that's ok; I put it in there. So, let's load the data set.

In [2]:
df = pd.read_csv('data/grant_complete.csv')

To remind us what is in the data set, let's take a quick look.

In [3]:
df.head()
Out[3]:
band beak depth (mm) beak length (mm) species year
0 20123 8.05 9.25 fortis 1973
1 20126 10.45 11.35 fortis 1973
2 20128 9.55 10.15 fortis 1973
3 20129 8.75 9.95 fortis 1973
4 20133 10.15 11.55 fortis 1973

We have beak depth and beak length data for two species, G. fortis and G. scandens, for a variety of years.

Using HoloViews for high level plots

Much like Altair enables high-level plotting where you input a DataFrame, HoloViews offers similar functionality. A nice added benefit is that it automatically gives you some of Bokeh's great interactivity. Let's take it for a spin. Note that we have already imported HoloViews as hv and that we have specified its backend to be Bokeh by calling hv.extension('bokeh').

We'll start by making a scatter plot of beak depth versus beak length for both G. fortis and G. scandens.

In [4]:
scatter = hv.Scatter(df, 
                     kdims=['beak length (mm)', 'beak depth (mm)'],
                     vdims=['species', 'band', 'year'])
scatter
Out[4]:

So now we have a plot of beak depth versus beak length for all beaks measured by the Grants.

HoloViews is designed to just be a way to look at your data. You could look at your data as a table, which is what we did in a cell above, and we'll do here, just for fun.

In [5]:
df.head()
Out[5]:
band beak depth (mm) beak length (mm) species year
0 20123 8.05 9.25 fortis 1973
1 20126 10.45 11.35 fortis 1973
2 20128 9.55 10.15 fortis 1973
3 20129 8.75 9.95 fortis 1973
4 20133 10.15 11.55 fortis 1973

In order to look at the data graphically, we need to add some additional information. Specifically, HoloViews requires that we specify which columns in the DataFrame are key dimensions and which are value dimensions. Key dimensions are indexing dimensions, which say where on the graphic the data in a row will reside. The value dimensions give information about each data point. For example, for a dot at position (9.25, 8.05), corresponding to the first row of the DataFrame (beak length on the x-axis, beak depth on the y-axis), there is also information about the band, species, and year. So, we specified as key dimensions the beak length and beak depth, and as value dimensions the band, species, and year. We did this using the respective kdims and vdims arguments.
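To make the split between key and value dimensions concrete, here is a minimal pandas-only sketch. It rebuilds the five rows shown by df.head() above (the name df_head is just a label chosen for this illustration) and pulls out, for the first row, the position given by the key dimensions and the annotations given by the value dimensions. It does not call HoloViews at all; it only mimics the bookkeeping.

```python
import pandas as pd

# The five rows displayed by df.head() above, typed in by hand
df_head = pd.DataFrame({
    'band': [20123, 20126, 20128, 20129, 20133],
    'beak depth (mm)': [8.05, 10.45, 9.55, 8.75, 10.15],
    'beak length (mm)': [9.25, 11.35, 10.15, 9.95, 11.55],
    'species': ['fortis'] * 5,
    'year': [1973] * 5,
})

# Same dimension lists as passed to hv.Scatter
kdims = ['beak length (mm)', 'beak depth (mm)']
vdims = ['species', 'band', 'year']

# Key dimensions: where the glyph for the first row sits (x, y)
position = tuple(float(v) for v in df_head.loc[0, kdims])

# Value dimensions: the extra information attached to that glyph
annotation = df_head.loc[0, vdims].to_dict()

print(position)
print(annotation)
```

The order of the columns in kdims sets which one ends up on the x-axis, so swapping the two entries would swap the axes.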

We used hv.Scatter to invoke an element of visualization. An element is just a way of converting the tabular nature of the data to a graphical representation, in this case a scatter plot. The set of elements that HoloViews has can be found here.

So, we have specified our key dimensions, which HoloViews used to place the circle glyphs, but we did not really use the value dimensions to annotate the glyphs. Fortunately, like a DataFrame, a Scatter object has a groupby() method that works as you might expect. It will group the graphical object by the specified columns of the DataFrame. Let's put that to use to group the data by species and year (we can ignore band).

In [6]:
scatter = hv.Scatter(df, 
                     kdims=['beak length (mm)', 'beak depth (mm)'],
                     vdims=['species', 'band', 'year']
                    ).groupby(['species', 'year'])
scatter
Out[6]:

Ah, very nice! HoloViews has made a plot for a given species/year, and allowed us to select them with a pulldown menu and slider widget, which were generated automatically. But, what if we want G. fortis and G. scandens together on the same plot? We can use the overlay() method of the Scatter object to overlay two sets of plots.

In [7]:
scatter = hv.Scatter(df, 
                     kdims=['beak length (mm)', 'beak depth (mm)'],
                     vdims=['species', 'band', 'year']
                    ).groupby(['species', 'year'])
scatter.overlay('species')
Out[7]:

Very nice!
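If it helps to think about what the grouping and overlaying are doing, here is a rough pandas analogue. It is only an analogy, not a description of how HoloViews works internally, and the toy data frame is made up for illustration (it is not taken from the Grants' measurements). Grouping by species and year gives one subset per (species, year) pair, which corresponds to one plot per widget setting. Overlaying by species instead keeps one entry per year, with the species subsets sharing that entry, just as they share a set of axes.

```python
import pandas as pd

# Toy data, made up for illustration only
toy = pd.DataFrame({
    'species': ['fortis', 'fortis', 'scandens', 'scandens', 'fortis', 'scandens'],
    'year': [1973, 1973, 1973, 1975, 1975, 1975],
    'beak length (mm)': [10.0, 11.0, 14.0, 13.5, 10.5, 14.5],
    'beak depth (mm)': [9.0, 9.5, 9.0, 8.5, 9.2, 9.1],
})

# Analogue of .groupby(['species', 'year']):
# one subset of rows per (species, year) pair
panels = {key: sub for key, sub in toy.groupby(['species', 'year'])}

# Analogue of .overlay('species'):
# one entry per year, holding the subsets for every species in that year
overlaid = {
    year: {species: sub for species, sub in by_year.groupby('species')}
    for year, by_year in toy.groupby('year')
}

print(sorted(panels))
print({year: sorted(subsets) for year, subsets in overlaid.items()})
```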

Finally, I might want to customize the appearance of the plot by having grid lines, setting its size, and even specifying that I additionally want a hover tool. This can be done using the options() method, as shown below.

Alternatively, if we want these options to be active for all Scatter plots in a notebook, we can put the following line magics in a cell at the top of the notebook. (The %opts line magic sets options globally, whereas the %%opts cell magic applies only to the output of the cell it is in.)

%opts Scatter [show_grid=True, width=600, height=300, tools=['hover'], border=10, legend_position='right']
%opts Scatter (color=hv.Cycle(['dodgerblue', 'tomato']))

Options in brackets are plot options, meaning they specify how the "canvas" on which the glyphs are painted is styled. Options in parentheses are style options, meaning that they specify how the glyphs themselves are styled. See the HoloViews docs for more information.

Note that future versions of Bokeh and HoloViews will allow more refined control over the positioning of legends than what is used in the following.

In [8]:
scatter = hv.Scatter(df, 
                     kdims=['beak length (mm)', 'beak depth (mm)'],
                     vdims=['species', 'band', 'year']
                    ).groupby(['species', 'year'])

# Set formatting options
scatter = scatter.options(show_grid=True,
                          width=600, 
                          height=300, 
                          tools=['hover'], 
                          border=10, 
                          legend_position='right',
                          color=hv.Cycle(['dodgerblue', 'tomato']))

scatter.overlay('species')
Out[8]:

We now have a very nice, clearly annotated plot of our data. Importantly, constructing HoloViews elements requires much the same thinking you employ when working with tidy data frames.

Building interactive plots with Bokeh

You can also hand-build interactive plots with Bokeh. To demonstrate this, we will make an interactive plot of the dynamics of the famous repressilator. We can solve the dynamical system rapidly and plot the result interactively.

The dynamical equations for a simplified repressilator are

\begin{align} \frac{\mathrm{d}x_1}{\mathrm{d}t} &= \frac{\beta}{1+x_3^n} - x_1 \\[1em] \frac{\mathrm{d}x_2}{\mathrm{d}t} &= \frac{\beta}{1+x_1^n} - x_2 \\[1em] \frac{\mathrm{d}x_3}{\mathrm{d}t} &= \frac{\beta}{1+x_2^n} - x_3 \end{align}

We can integrate the dynamical equations to see the levels of the respective proteins. We use interactive plotting of the result so we can see how the dynamics depend on the parameters $\beta$ and $n$. Note that to have these interactions, you need to be running this in JupyterLab and have ipywidgets installed; the interactivity is lost in the static HTML rendering.

In [9]:
# Define right hand side of ODEs
def dx_dt(x, t, beta, n):
    """
    Returns 3-array of (dx_1/dt, dx_2/dt, dx_3/dt)
    """
    x_1, x_2, x_3 = x
    return np.array([beta / (1 + x_3**n) - x_1,
                     beta / (1 + x_1**n) - x_2,
                     beta / (1 + x_2**n) - x_3])

# Initial conditions
x0 = np.array([1, 1, 1.2])

# Time points
t = np.linspace(0, 30, 1000)

# Choose parameters
beta = 10.0
n = 3

# Solve it!
x = scipy.integrate.odeint(dx_dt, x0, t, args=(beta, n))

# Plot the solutions
source = bokeh.models.ColumnDataSource(
            data=dict(x1=x[:,0], x2=x[:,1], x3=x[:,2], t=t))
p = bokeh.plotting.figure(width=600, height=300, x_axis_label='t',
                           border_fill_alpha=0, background_fill_alpha=0)
# legend_label requires Bokeh 1.4 or later; older versions used legend=
p.line('t', 'x1', source=source, line_width=3, color='dodgerblue', legend_label='1')
p.line('t', 'x2', source=source, line_width=3, color='tomato', legend_label='2')
p.line('t', 'x3', source=source, line_width=3, color='slateblue', legend_label='3')
p.legend.location = 'top_left'
bokeh.io.show(p, notebook_handle=True)

# Set up callbacks
def update(n=3, beta=10):
    # Generate the new curve
    x = scipy.integrate.odeint(dx_dt, x0, t, args=(beta, n))

    # Re-source data
    source.data = dict(x1=x[:,0], x2=x[:,1], x3=x[:,2], t=t)
    bokeh.io.push_notebook()
    
ipywidgets.interactive(update, n=ipywidgets.FloatSlider(min=1, max=5, value=3), 
                       beta=ipywidgets.FloatSlider(min=1, max=100, value=10))
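
The callback above does two things: it re-solves the ODEs for the new parameter values, and it hands a new dictionary of columns to the data source. One way to keep those separate is sketched below, so the numerical part can be checked without a notebook or a plot. The helper name repressilator_data is hypothetical, chosen only for this example, and the default initial condition and time points are copied from the cell above.

```python
import numpy as np
import scipy.integrate


def dx_dt(x, t, beta, n):
    """Right-hand side of the simplified repressilator ODEs."""
    x_1, x_2, x_3 = x
    return np.array([beta / (1 + x_3**n) - x_1,
                     beta / (1 + x_1**n) - x_2,
                     beta / (1 + x_2**n) - x_3])


def repressilator_data(beta, n, x0=(1.0, 1.0, 1.2), t=None):
    """Hypothetical helper: solve the ODEs and return a dict of
    equal-length columns keyed by 'x1', 'x2', 'x3', and 't'."""
    if t is None:
        t = np.linspace(0, 30, 1000)
    x = scipy.integrate.odeint(dx_dt, np.array(x0), t, args=(beta, n))
    return dict(x1=x[:, 0], x2=x[:, 1], x3=x[:, 2], t=t)


# Solve for the same parameter values used in the cell above
data = repressilator_data(beta=10.0, n=3)
print({key: col.shape for key, col in data.items()})
```

With a helper like this, the body of update() would reduce to assigning source.data = repressilator_data(beta, n) and then calling bokeh.io.push_notebook(). Every column in the dictionary has the same length, which is what a ColumnDataSource expects.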