Exercise 11.1: Practice with HoloViews


a) In a population of size \(N\) with frequency \(p\) of allele A and frequency \(q\) of allele a (with \(q = 1 - p\)), the variance of allele frequency \(p\) across replicated populations is approximately

\begin{align} V \approx pq\left(1 - \mathrm{e}^{-t/2N}\right), \end{align}

where \(t\) is the number of generations. Use HoloViews to plot this function versus \(t\) for some given values of \(p\) and \(N\). If you’re feeling adventurous, you can make widgets to vary \(p\) and \(N\) and investigate how the variance changes.

b) Recall the data set from Exercise 9.1 in which we analyzed the effects of neonicotinoid pesticides on bee sperm. Use HoloViews to make a scatter plot of the number of alive sperm versus the number of dead sperm, with each point colored according to whether the bees were treated with pesticide. As a reminder, the data set is stored in ~git/bootcamp/data/bee_sperm.csv.

c) 96 well plates are often used in analyzing biochemical reactions. Some absorbance data from a 96 well plate experiment are in the file ~git/bootcamp/data/96_well.csv. (You will need to git pull upstream master to get the data set.) Use HoloViews’s HeatMap element to make a display of the data. Hint: If you want to display a colorbar, you may need to also include colorbar_opts={'bar_line_color': None} in your opts() because of a possible bug in HoloViews for rendering colorbars.

Solution


[1]:
import numpy as np
import pandas as pd

import holoviews as hv
hv.extension('bokeh')

import panel as pn
pn.extension()

import bootcamp_utils.hv_defaults
bootcamp_utils.hv_defaults.set_defaults()

a) I will write a function that uses HoloViews to generate the plot and allows \(p\) and \(N\) to be varied using sliders. Importantly, I set the limits of the y-axis to go from zero to just above 0.25, since the variance is bounded above by \(pq \le 1/4\). I could plot the variance using hv.Curve or hv.Scatter. Either is valid: we usually talk about discrete generations, which suggests points, but the approximate expression for the variance treats time as continuous, which suggests a line. I will use hv.Curve().

[2]:
p_slider = pn.widgets.FloatSlider(
    name="p", start=0, end=1, value=0.5, step=0.01
)
N_slider = pn.widgets.FloatSlider(name="N", start=1, end=100, value=30)


@pn.depends(p_slider.param.value, N_slider.param.value)
def plot_variance(p, N):
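    # Generations at which to evaluate the variance
    t = np.linspace(0, 100, 200)
    V = p * (1 - p) * (1 - np.exp(-t / (2 * N)))

    # The variance cannot exceed 1/4, so fix the y-axis just above that
    return hv.Curve((t, V), kdims="t", vdims="V").opts(ylim=(0, 0.26))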


pn.Row(
    plot_variance,
    pn.Spacer(width=15),
    pn.Column(pn.Spacer(height=15), p_slider, pn.Spacer(height=15), N_slider),
)
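
The choice of y-axis limits can be checked numerically. The following is a minimal sketch using only the standard library; the function name `allele_freq_variance` and the grid of \(p\) values are illustrative choices, not part of the solution above. It confirms that \(pq\) peaks at 0.25 and that the variance reaches half of its asymptote at \(t = 2N \ln 2\).

```python
import math


def allele_freq_variance(p, N, t):
    """Approximate variance of allele frequency after t generations."""
    return p * (1 - p) * (1 - math.exp(-t / (2 * N)))


# p (1 - p) is largest at p = 0.5, so the long-time limit never exceeds 0.25.
max_pq = max((k / 100) * (1 - k / 100) for k in range(101))

# Setting 1 - exp(-t / 2N) = 1/2 gives the half-saturation time t = 2 N ln 2.
N = 30
t_half = 2 * N * math.log(2)

print(max_pq)
print(allele_freq_variance(0.5, N, t_half))
```

For the default slider values (\(p = 0.5\), \(N = 30\)), the half-saturation time is about 42 generations, which falls within the plotted range of 0 to 100.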


b) For this plotting task, we read in the data set and specify the key dimensions as the number of dead sperm (in millions) and the number of alive sperm (also in millions). We specify the treatment as a value dimension so we can use it in a groupby/overlay operation to color the data points.

[3]:
df = pd.read_csv('data/bee_sperm.csv', comment='#')

hv.Points(
    df,
    kdims=['Dead Sperm Millions', 'Alive Sperm Millions'],
    vdims=['Treatment']
).groupby(
    'Treatment'
).overlay(
)
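
Conceptually, the groupby/overlay operation partitions the points by treatment and then draws all partitions on the same axes, one color per group. The sketch below mimics the partitioning step in plain Python. The rows are hypothetical stand-ins for the real data set: the numbers are invented for illustration and the treatment labels are assumed.

```python
from collections import defaultdict

# Hypothetical rows standing in for bee_sperm.csv; the numbers are invented
# for illustration only and the treatment labels are assumed.
rows = [
    {"Treatment": "Control", "Dead Sperm Millions": 0.2, "Alive Sperm Millions": 1.9},
    {"Treatment": "Pesticide", "Dead Sperm Millions": 0.6, "Alive Sperm Millions": 1.1},
    {"Treatment": "Control", "Dead Sperm Millions": 0.3, "Alive Sperm Millions": 2.4},
    {"Treatment": "Pesticide", "Dead Sperm Millions": 0.9, "Alive Sperm Millions": 0.8},
]

# Group by the value dimension: one list of (x, y) points per treatment.
groups = defaultdict(list)
for row in rows:
    groups[row["Treatment"]].append(
        (row["Dead Sperm Millions"], row["Alive Sperm Millions"])
    )

# An overlay would draw each of these groups on shared axes.
for treatment, points in groups.items():
    print(treatment, points)
```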


c) We load in the data frame and make the heat map. The key dimensions position the measurements in the heat map and the value dimensions give the color. We include a colorbar. In the options, we set the frame height and width such that each entry in the heat map is square; a 96 well plate has 8 rows and 12 columns, so the width is 3/2 times the height. We invert the y-axis and put the x-axis labels on top, as is traditional for labeling 96 well plates. We also include a hover tool so that we can see annotations about which wells are experiments and which are controls.

[4]:
df = pd.read_csv("data/96_well.csv")

hv.HeatMap(
    data=df, kdims=["column", "row"], vdims=["absorbance", "experiment"],
).opts(
    colorbar=True,
    colorbar_opts={"title": "absorbance", "bar_line_color": None},
    frame_height=250,
    frame_width=250 * 3 // 2,
    invert_yaxis=True,
    tools=["hover"],
    xaxis="top",
    xlabel="",
    ylabel="",
)
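
The frame dimensions above can be checked with a little arithmetic. This sketch assumes the standard 8 row by 12 column layout of a 96 well plate (the layout of the data file itself is not verified here) and confirms that the chosen frame size gives square cells.

```python
# Assumed standard 96 well plate layout: 8 rows (A-H) by 12 columns.
n_rows, n_cols = 8, 12

frame_height = 250
frame_width = frame_height * 3 // 2  # same expression as in the opts() call

# Size in pixels of one heat map entry along each axis.
cell_height = frame_height / n_rows
cell_width = frame_width / n_cols

print(frame_width, cell_height, cell_width)
```

Because 12/8 = 3/2, any frame height that keeps `frame_height * 3` even yields exactly square cells with this expression.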


Computing environment

[5]:
%load_ext watermark
%watermark -v -p numpy,pandas,bootcamp_utils,bokeh,holoviews,panel,jupyterlab
CPython 3.7.7
IPython 7.16.1

numpy 1.18.5
pandas 0.24.2
bootcamp_utils 0.0.6
bokeh 2.1.1
holoviews 1.13.3
panel 0.9.7
jupyterlab 2.1.5