Lesson 41: Interactive plotting with Bokeh

(c) 2018 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.

This document was prepared at Caltech with financial support from the Donna and Benjamin M. Rosen Bioengineering Center.

This lesson was generated from a Jupyter notebook. You can download the notebook here.



In [1]:
import numpy as np
import pandas as pd

import skimage
import skimage.io

# Import Bokeh modules for interactive plotting
import bokeh.io
import bokeh.models
import bokeh.palettes
import bokeh.plotting

import bootcamp_utils

# Display graphics in this notebook
bokeh.io.output_notebook()
Loading BokehJS ...

Before we begin, make sure you are using the latest version of Bokeh, v. 0.12.16. After importing, verify that this is the case.

In [2]:
bokeh.__version__
Out[2]:
'0.12.16'

If you need to update Bokeh, you may do so at the command line:

conda update bokeh

Importantly, Bokeh is gearing up for its 1.0 release, which means there may be some API changes in the not-so-distant future. And there will certainly be enhancements. Bear this in mind in this lesson and when writing code that uses Bokeh.

I hope you have enjoyed using Altair. It's an intuitive way to generate beautiful plots. Sometimes, though, you need a lower level plotting library to have more control over your plots. In the Python visualization universe, Bokeh (pronounced BOH-kay) is my favorite. It has a clean grammar and allows construction of interactive graphics. In this tutorial, we will explore using it.

The data set

In this lesson, we will explore some of Bokeh's features using the finch beak data from Exercise 3. Upon completing that exercise, you should have created a tidy data frame with the data for several years and stored it in a file called grant_complete.csv in your repo. If you did not, that's ok; I put it in the bootcamp repository in the directory ~/git/bootcamp/data/.

In [3]:
df = pd.read_csv('data/grant_complete.csv')

To remind us what is in the data set, let's take a quick look.

In [4]:
df.head()
Out[4]:
band beak depth (mm) beak length (mm) species year
0 20123 8.05 9.25 fortis 1973
1 20126 10.45 11.35 fortis 1973
2 20128 9.55 10.15 fortis 1973
3 20129 8.75 9.95 fortis 1973
4 20133 10.15 11.55 fortis 1973

We have beak depth and beak length data for two different species, G. fortis and G. scandens, for a variety of years.

Making a Bokeh plot

The pipeline for making a plot using Bokeh is to first specify the "canvas" on which you want to paint your data, and then "paint" your data. For example, if we wanted to make a scatter plot of the beak length/depth data, we first think about what space it should occupy. Specifically, we want a figure in Cartesian coordinates that is 400 pixels high and 600 pixels wide. Next, we think about what we want each axis in the plot to represent. We will say that the x-axis represents beak length and the y-axis beak depth. So, the first thing we do is make the figure we will work with. So far, the data are not involved at all in the plotting process.

In [5]:
# Build figure
p = bokeh.plotting.figure(height=400,
                          width=600,
                          x_axis_label='beak length (mm)',
                          y_axis_label='beak depth (mm)')

Next, we might think about what we want to happen when we hover over the data. We will display the band number, and also the values of the beak length and depth (just to demonstrate how to format numbers in the hover). Notice that when specifying hovers with columns whose names contain spaces, we place the column name in braces. The second set of braces, which follows the column name, specifies the display format of the number, in this case as a float with two places past the decimal.

In [6]:
# Set up hover tool
hover = bokeh.models.HoverTool(tooltips=[('band', '@band'), 
                                         ('length', '@{beak length (mm)}{0.2f}'),
                                         ('depth', '@{beak depth (mm)}{0.2f}')])

# Add the tool to the figure
p.add_tools(hover)
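To make the tooltip syntax concrete, here is a minimal sketch of a helper that assembles the field strings used above. The function name tooltip_field is hypothetical (it is not part of Bokeh); it only encodes the rule described above: '@' plus the column name, with braces around names containing spaces, optionally followed by a format in braces.

```python
def tooltip_field(column, fmt=None):
    """Build a hover tooltip field string for a data source column.

    Hypothetical helper for illustration; not part of Bokeh.
    """
    # Column names containing spaces must be wrapped in braces
    if ' ' in column:
        field = '@{' + column + '}'
    else:
        field = '@' + column

    # An optional second set of braces carries the number format
    if fmt is not None:
        field += '{' + fmt + '}'

    return field


# Rebuild the tooltips list used in the hover tool above
tooltips = [('band', tooltip_field('band')),
            ('length', tooltip_field('beak length (mm)', '0.2f')),
            ('depth', tooltip_field('beak depth (mm)', '0.2f'))]

for label, field in tooltips:
    print(label, field)
```

The resulting list can be passed as the tooltips kwarg of bokeh.models.HoverTool(), exactly as in the cell above.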

Note that we still have not invoked the actual data. We have been setting up the space the data will occupy and how we will interact with it. Now that that is all in place, we can start to populate our plot with data. First, let's set up the indices we want for extraction from the (tidy) DataFrame of finch beak data.

In [7]:
# For convenience, the indices we want for the species
inds_f = (df['year']==1987) & (df['species']=='fortis')
inds_s = (df['year']==1987) & (df['species']=='scandens')
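If the Boolean indexing above looks unfamiliar, the sketch below shows the same pattern on a tiny data frame. The values are made up purely for illustration and are not from the finch data set.

```python
import pandas as pd

# Toy data (made-up values), only to illustrate the indexing pattern
toy = pd.DataFrame({'species': ['fortis', 'scandens', 'fortis', 'scandens'],
                    'year': [1973, 1987, 1987, 1987],
                    'beak depth (mm)': [8.0, 9.0, 10.0, 11.0]})

# Each comparison gives a Boolean Series; & combines them element-wise.
# The parentheses are required because & binds more tightly than ==.
inds = (toy['year'] == 1987) & (toy['species'] == 'fortis')

# Using the Boolean Series with .loc keeps only the rows where it is True
subset = toy.loc[inds, :]
print(subset)
```

Only the row that satisfies both conditions survives, which is what we want to hand to Bokeh as a data source for one species in one year.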

Now, it's a matter of populating the plot with the data. In Bokeh, the figure we created has methods for populating it with data. The name of the method is the name of the glyph you want to use to represent your data. In our case, we will use a circle. The p.circle() method takes $x$ and $y$ values as inputs. These may be NumPy arrays, lists, or column headings in a DataFrame. If they are column headings, we specify the data source using the source kwarg.

Before we specify that, let's decide on the colors we want to use.

In [8]:
# Use standard D3 colors
colors = bokeh.palettes.d3['Category10'][10]

Next, we populate the glyphs. When we specify them, we can also specify their color, transparency, and what string should be associated with them in the legend.

In [9]:
# Paint the glyphs
p.circle(x='beak length (mm)',
         y='beak depth (mm)',
         source=df.loc[inds_f, :],
         color=colors[0], 
         alpha=0.25,
         legend='fortis')
p.circle(x='beak length (mm)',
         y='beak depth (mm)',
         source=df.loc[inds_s, :],
         color=colors[1],
         alpha=0.25,
         legend='scandens')

bokeh.io.show(p)

Displaying images in Bokeh

Bokeh can also be used to display images, which is useful for zooming in on regions of interest. When doing so, you need to explicitly specify the plot dimensions using the plot_height, plot_width, x_range, and y_range kwargs of bokeh.plotting.figure(). Below is a function to display an image. Note that I choose Viridis, a perceptually uniform colormap, as the colormap.

In [10]:
def bokeh_imshow(im, color_mapper=None, plot_height=400):
    """
    Display an image in a Bokeh figure.
    """
    # Get shape
    n, m = im.shape

    # Set up figure with appropriate dimensions
    plot_width = int(m/n * plot_height)
    p = bokeh.plotting.figure(plot_height=plot_height, plot_width=plot_width, 
                              x_range=[0, m], y_range=[0, n],
                              tools='pan,box_zoom,wheel_zoom,reset')

    # Set color mapper; we'll do Viridis with 256 levels by default
    if color_mapper is None:
        color_mapper = bokeh.models.LinearColorMapper(bokeh.palettes.viridis(256))

    # Display the image
    im_bokeh = p.image(image=[im[::-1,:]], x=0, y=0, dw=m, dh=n, 
                       color_mapper=color_mapper)
    
    return p
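Two small steps in the function above are worth seeing in isolation: choosing the plot width from the image shape, and reversing the rows. Here is a minimal sketch using a hypothetical 2 by 3 array in place of a real image. My understanding is that the rows are reversed because Bokeh's image glyph draws the first row of the array at the bottom of the plot, while image arrays conventionally store the top row first.

```python
import numpy as np

# Hypothetical 2-row by 3-column "image"; row 0 is the top row by convention
im = np.array([[1, 2, 3],
               [4, 5, 6]])
n, m = im.shape

# Choose the width so that pixels come out square for a given plot height
plot_height = 400
plot_width = int(m / n * plot_height)

# Reverse the row order, so the original top row becomes the last row
flipped = im[::-1, :]

print(plot_width)
print(flipped)
```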

Let's use this function to look at an image of bacteria using Bokeh.

In [11]:
im = skimage.io.imread('data/bsub_100x_phase.tif')

p = bokeh_imshow(im)
bokeh.io.show(p)

This function works fine, but I also wrote a function in the bootcamp_utils module, bootcamp_utils.bokeh_imshow(), that is useful for displaying images. It takes the kwargs interpixel_distance and length_units, which allow you to specify physical distances for the axes. Traditionally, people burn scale bars into images. This is akin to geologists putting their pickax in photos to give a sense of scale. It is absolutely essential that the length scale is clear in any image you display for research purposes. So long as we display the image with the axes labeled with real length units, we do not need to burn in a scale bar. This is preferred, because when zooming in on an image, you can lose the scale bar, but the axes remain.

In [12]:
p = bootcamp_utils.bokeh_imshow(im, interpixel_distance=0.0636, length_units='µm')
bokeh.io.show(p)
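The arithmetic behind physical axes is simple: the extent of the image along each axis is the number of pixels times the interpixel distance. The sketch below illustrates this with a hypothetical helper, physical_axes, and a hypothetical image shape. It is not the implementation in bootcamp_utils, just the idea.

```python
def physical_axes(shape, interpixel_distance=1.0, length_units='pixels'):
    """Return the width and height of an image in physical units.

    Hypothetical helper for illustration; not the bootcamp_utils code.
    """
    n, m = shape  # number of rows, number of columns

    # Extent along each axis is pixel count times interpixel distance
    dw = m * interpixel_distance
    dh = n * interpixel_distance

    return dw, dh, length_units


# Hypothetical 500 x 1000 pixel image at 0.0636 µm per pixel
dw, dh, units = physical_axes((500, 1000),
                              interpixel_distance=0.0636,
                              length_units='µm')
print('width: {0:.1f} {2}, height: {1:.1f} {2}'.format(dw, dh, units))
```

Values like dw and dh are what one would use for the axis ranges and for the dw and dh kwargs of p.image(), so that the axes read in real length units instead of pixels.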

Exporting plots

Bokeh offers three main options for exporting plots. As we look at them, we will again create the plot of finch beak data from 1987, this time with only the band number showing up when hovering.

In [13]:
# Build figure
p = bokeh.plotting.figure(height=400, width=600, x_axis_label='beak length (mm)',
                         y_axis_label='beak depth (mm)')

# Set up hover tool
hover = bokeh.models.HoverTool(tooltips=[('band', '@band')])

# Add the tool to the figure
p.add_tools(hover)

# Paint the glyphs
p.circle('beak length (mm)', 'beak depth (mm)', source=df.loc[inds_f, :], color=colors[0], 
         alpha=0.25)
p.circle('beak length (mm)', 'beak depth (mm)', source=df.loc[inds_s, :], color=colors[1],
         alpha=0.25);

We are not bothering to show the plot here, since it is above, and we are demonstrating output.

The first, and easiest, way to export an image is to click on the 3.5 inch floppy disk icon appearing next to the plot. This will export the plot as a PNG file with reasonable resolution. In my experience, this resolution is sufficient for using the plot in a presentation. You can also export to PNG programmatically.

In [14]:
bokeh.io.export_png(p, filename='beaks_1987.png')
Out[14]:
'/Users/Justin/Dropbox/git/programming_bootcamp/2018/lessons/beaks_1987.png'

The function bokeh.io.export_png() returns the full path of the saved file, so you conveniently know where it is.

The second, and most common and useful in my opinion, way is to export the plot as an HTML file. This HTML file can be opened in any browser and will have full interactivity. You could, for example, email the HTML file to your boss, or submit it with a paper. For the time being, this will mostly be in the supplemental materials of a paper, but the paper of the future is interactive, and plots like these will be regularly incorporated into papers.

In [15]:
# First specify the output file
bokeh.io.output_file('beaks_1987.html', title='Daphne Major finch beaks 1987')

# Save it to HTML
bokeh.io.save(p)
Out[15]:
'/Users/Justin/Dropbox/git/programming_bootcamp/2018/lessons/beaks_1987.html'

Finally, if you want to include a plot in a paper of the past (which is often also how the paper of the present is formatted), you want to export vector graphics. These files can be opened and edited in your favorite vector graphics editing software, like Inkscape or Adobe Illustrator. They can also be opened with any modern web browser. You can also convert them to PDF using utilities like CairoSVG. So, let's make a nice vector graphics plot!

In [16]:
# Specify that p's output is SVG
p.output_backend = 'svg'

# Export to SVG
bokeh.io.export_svgs(p, 'beaks_1987.svg')

# Switch p's output back to HTML canvas, which is more performant for interactivity
p.output_backend = 'canvas'

There is a lot more you can do with Bokeh. You can explore more here. Importantly, you can do calculations behind the scenes for your plots, which expands your capabilities to do both analysis and visualization concurrently. We will talk about how to do this on the last day of the bootcamp.