Exercise 4.6: Building dashboards

Choose to do either or both of the following.

a) In Exercise 3.6, you performed graphical exploratory data analysis with the Darwin finch beak data set. Revisit that data set and build a dashboard for exploring it. You might want to think about adding things like summary statistics with confidence intervals as well.

b) Build a dashboard to explore a data set that interests you, either from a data repository or from your own research.

Solution

THIS SOLUTION IS INCOMPLETE.

[1]:

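import numpy as np
import pandas as pd

import bokeh.io
import bokeh.layouts
import bokeh.palettes
import bokeh.plotting

import iqplot

# Panel supplies the widgets, pn.depends, and the dashboard layout used below
import panel as pn

bokeh.io.output_notebook()
pn.extension()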

Loading BokehJS ...

a) For my dashboard, I will make a scatter plot of beak depth versus beak length for the two species and let the user select which year to highlight. This is the only plot whose data I want to control interactively; the ECDFs always show all years, and only their display style can be changed.

First, we load in the data set.

[2]:

df = pd.read_csv('data/grant_complete.csv')

df.head()

[2]:

	band	beak depth (mm)	beak length (mm)	species	year
0	20123	8.05	9.25	fortis	1973
1	20126	10.45	11.35	fortis	1973
2	20128	9.55	10.15	fortis	1973
3	20129	8.75	9.95	fortis	1973
4	20133	10.15	11.55	fortis	1973

To keep the axis ranges the same no matter which year is selected, I will write a function that computes padded ranges from all measurements in the data set.

[3]:

def data_range(df, padding=0.05):
    """Range of data for length and depth."""
    bl_range = (df["beak length (mm)"].min(), df["beak length (mm)"].max())
    bd_range = (df["beak depth (mm)"].min(), df["beak depth (mm)"].max())

    bl_diff = bl_range[1] - bl_range[0]
    bd_diff = bd_range[1] - bd_range[0]

    length_range = [
        bl_range[0] - bl_diff * padding,
        bl_range[1] + bl_diff * padding,
    ]
    depth_range = [
        bd_range[0] - bd_diff * padding,
        bd_range[1] + bd_diff * padding,
    ]

    return length_range, depth_range
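
As a quick check of the padding arithmetic, here is a minimal sketch that applies the same calculation to a toy data frame. The helper `padded_range` and the toy values are made up for illustration and are not part of the solution.

```python
import pandas as pd


def padded_range(values, padding=0.05):
    """Return [min - pad, max + pad], where pad = padding * (max - min)."""
    lo, hi = values.min(), values.max()
    pad = (hi - lo) * padding
    return [lo - pad, hi + pad]


# Toy measurements (hypothetical, not taken from the finch data set)
toy = pd.DataFrame(
    {
        "beak length (mm)": [10.0, 15.0, 20.0],
        "beak depth (mm)": [8.0, 9.0, 12.0],
    }
)

length_range = padded_range(toy["beak length (mm)"])
depth_range = padded_range(toy["beak depth (mm)"])

# Lengths span 10 mm, so each side is padded by about 0.5 mm
print(length_range)
# Depths span 4 mm, so each side is padded by about 0.2 mm
print(depth_range)
```

Because the ranges are computed from the full data set rather than from the selected subset, the axes do not jump when the selection changes.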

Next, we set up the widgets: a selector for the year to highlight, a selector for how the other years are displayed in the scatter plot (hidden or muted), and a selector for the style of the ECDFs.

[5]:

# Panel widgets (not Bokeh models) are needed so that pn.depends can watch them
year_selector = pn.widgets.Select(
    name="year", options=[str(year) for year in np.sort(df["year"].unique())]
)

other_years_selector = pn.widgets.Select(
    name="other years", options=["hidden", "muted"], value="muted"
)

ecdf_style_selector = pn.widgets.Select(
    name="ECDF style", options=["staircase", "dots"], value="staircase"
)
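
One detail to watch: the year options are built with `str(year)`, so the selected value is a string, while the `year` column of the data frame holds integers. A minimal sketch with toy values (the series below is hypothetical, not the finch data) of why the value should be converted before it is compared with the column:

```python
import pandas as pd

# Toy stand-in for the year column (hypothetical values)
years = pd.Series([1973, 1975, 1973, 2012], name="year")

# A widget option built with str(year) hands back a string
selected = "1973"

# Comparing the integer column with the string matches no rows
n_match_str = int((years == selected).sum())

# Converting to int first gives the intended selection
n_match_int = int((years == int(selected)).sum())

print(n_match_str, n_match_int)
```

An alternative would be to pass the integer years directly as the options, which avoids the conversion at the cost of handling non-string option values.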

Now, we’ll make the scatter plot.

[5]:

@pn.depends(year_selector.param.value, other_years_selector.param.value)
def scatter_plot(year, other_years):
    """Scatter plot of beak depth vs length."""
    # The selector holds years as strings; the data frame holds integers
    year = int(year)

    colors = {"fortis": "#1f77b3", "scandens": "orange"}

    length_range, depth_range = data_range(df)

    p = bokeh.plotting.figure(
        frame_width=300,
        frame_height=300,
        x_axis_label="beak length (mm)",
        y_axis_label="beak depth (mm)",
        x_range=length_range,
        y_range=depth_range,
    )

    if other_years != "hidden":
        # Plot every year, muting all but the selected one
        for y, sub_df in df.groupby("year"):
            for s, group in sub_df.groupby("species"):
                p.circle(
                    source=group,
                    x="beak length (mm)",
                    y="beak depth (mm)",
                    color=colors[s],
                    alpha=1 if y == year else 0.05,
                )
    else:
        # Plot only the selected year
        sub_df = df.loc[df["year"] == year, :]
        for s, group in sub_df.groupby("species"):
            p.circle(
                source=group,
                x="beak length (mm)",
                y="beak depth (mm)",
                color=colors[s],
            )

    return p

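We also use the ECDFs we built in an earlier exercise. Each species gets its own row, with beak depth on the left and beak length on the right, and the years are encoded by color.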

[6]:

@pn.depends(ecdf_style_selector.param.value)
def ecdfs(style):
    """Make ECDFs of beak depth and beak length for each species."""
    length_range, depth_range = data_range(df)

    palette_fortis = bokeh.palettes.Blues9
    p_depth_fortis = iqplot.ecdf(
        data=df.loc[df["species"] == "fortis", :],
        q="beak depth (mm)",
        cats="year",
        palette=palette_fortis,
        frame_height=150,
        title="fortis",
        style=style,
        x_range=depth_range,
    )

    p_length_fortis = iqplot.ecdf(
        data=df.loc[df["species"] == "fortis", :],
        q="beak length (mm)",
        cats="year",
        palette=palette_fortis,
        frame_height=150,
        title="fortis",
        style=style,
        x_range=length_range,
        show_legend=False,
    )

    # The scandens plots share x-ranges with the corresponding fortis plots
    palette_scandens = bokeh.palettes.Oranges9
    p_depth_scandens = iqplot.ecdf(
        data=df.loc[df["species"] == "scandens", :],
        q="beak depth (mm)",
        cats="year",
        palette=palette_scandens,
        frame_height=150,
        title="scandens",
        style=style,
        x_range=p_depth_fortis.x_range,
    )

    p_length_scandens = iqplot.ecdf(
        data=df.loc[df["species"] == "scandens", :],
        q="beak length (mm)",
        cats="year",
        palette=palette_scandens,
        frame_height=150,
        title="scandens",
        style=style,
        x_range=p_length_fortis.x_range,
        show_legend=False,
    )

    return bokeh.layouts.gridplot(
        [
            [p_depth_fortis, p_length_fortis],
            [p_depth_scandens, p_length_scandens],
        ]
    )
    )
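
For reference, the quantity plotted in each ECDF can be computed by hand. The following is a minimal sketch of the values behind a "dots"-style ECDF, written with NumPy only; the function name `ecdf_vals` and the toy data are my own and this is not how iqplot is implemented internally.

```python
import numpy as np


def ecdf_vals(data):
    """Return sorted data and the fraction of points <= each value."""
    x = np.sort(np.asarray(data, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y


# Toy beak depths in mm (hypothetical values)
x, y = ecdf_vals([9.5, 8.0, 10.5, 8.8])

print(x)
print(y)
```

Each sorted measurement is paired with the fraction of measurements less than or equal to it, so the last point always has an ECDF value of one.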

Now, we can lay out our dashboard.

[7]:

pn.Column(
    pn.Row(
        scatter_plot,
        pn.Spacer(width=15),
        pn.Column(
            pn.Spacer(height=30),
            year_selector,
            pn.Spacer(height=15),
            other_years_selector,
            pn.Spacer(height=15),
            ecdf_style_selector,
        ),
    ),
    pn.Spacer(height=15),
    ecdfs,
)

The ECDFs show at a glance how beak length and beak depth each change over the years for each species, with the year encoded by color. The scatter plot shows how the two measurements vary together in the selected year and, when the other years are muted rather than hidden, how that year compares with the rest.
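
The exercise statement suggests adding summary statistics with confidence intervals, which this dashboard does not yet include. Here is a minimal sketch of one way to compute them, assuming a percentile bootstrap for the mean. The function name `bootstrap_mean_ci`, the number of replicates, and the seed are my own choices, and the data are just the five beak depths displayed by `df.head()` above.

```python
import numpy as np


def bootstrap_mean_ci(data, n_reps=2000, ptiles=(2.5, 97.5), seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)

    # Resample with replacement and record the mean of each resample
    reps = np.array(
        [np.mean(rng.choice(data, size=len(data))) for _ in range(n_reps)]
    )

    return np.percentile(reps, ptiles)


# The five beak depths (mm) shown in the df.head() output
depths = np.array([8.05, 10.45, 9.55, 8.75, 10.15])

mean_depth = np.mean(depths)
low, high = bootstrap_mean_ci(depths)

print(mean_depth, low, high)
```

In a dashboard, values like these could be computed for each species in the selected year and displayed next to the scatter plot.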

b) I look forward to seeing what you do in your own work!

Computing environment

[13]:

%load_ext watermark
%watermark -v -p numpy,pandas,iqplot,bokeh,holoviews,panel,jupyterlab

Python implementation: CPython
Python version       : 3.8.10
IPython version      : 7.22.0

numpy     : 1.20.2
pandas    : 1.2.4
iqplot    : 0.2.3
bokeh     : 2.3.2
holoviews : 1.14.4
panel     : 0.11.3
jupyterlab: 3.0.14