Exercise 6.3: Growth curves from a movie


At the dawn of the molecular revolution in biology, key experiments by Jacques Monod in which he measured growth curves of bacteria under different conditions exposed some of the mechanisms of regulation of gene expression. Those growth curves were measured in a bulk solution. In this exercise, we will measure bacterial growth starting from two bacteria. The movie shows Bacillus subtilis constitutively expressing mCherry growing under slow growth conditions. These data were kindly donated by Jin Park from the Elowitz lab at Caltech.

a) Load in the series of images contained in the directory ~git/bootcamp/data/bacterial_growth/. Be sure that however you store them (a list or tuple or other object) has the frames in the proper order.

b) Segment the images to separate bacteria from background. You do not need to segment individual bacteria; this would likely require some more advanced techniques involving edge detection that we haven’t covered in bootcamp.

c) Show a representative image from the stack (with the segmentation overlayed) of images. Be sure to check out the README file in the directory containing the images to get the interpixel distance.

d) If \(m\) is the mass of bacteria, which is proportional to the area in the images, then, for exponential growth,

\begin{align} m(t) = m_0\,\mathrm{e}^{rt}, \end{align}

where \(r\) is the growth rate. Taking the logarithm of both sides gives

\begin{align} \ln m(t) = \ln m_0 + rt. \end{align}

So, the slope of the line on a log-log scale is the growth rate.

With this in mind, plot the bacterial area as a function of time with the area on a log scale. If the plot is linear, the bacteria are performing exponential growth.

e) To get the slope of the line, you can use the np.polyfit() function and perform the regression on a semilog scale. Alternatively, you can fit the exponential function directly using scipy.optimize.curve_fit(). Once you find the growth rate (and intercept), plot the theoretical growth curve along with the data.

Solution


[1]:
import os
import glob

import numpy as np
import pandas as pd

import skimage.io
import skimage.morphology
import skimage.segmentation
import skimage.measure

import holoviews as hv
hv.extension('bokeh')

import panel as pn
pn.extension()

import bokeh.io
bokeh.io.output_notebook()
Loading BokehJS ...

a) We will load the images into a tuple, where the index of the tuple corresponds to the frame number. Under normal circumstances, we would do data validation to make sure there are no skipped frames, but in the interest of brevity, we will proceed assuming all data are properly labeled and organized.

[2]:
# Use a list comprehension to read in the images
ims = [
    skimage.io.imread(fname)
    for fname in sorted(glob.glob("data/bacterial_growth/bacillus_*.tif"))
]

# Store it as a tuple so we don't mess with it
ims = tuple(ims)

# How many?
print("There are", len(ims), "images.")
There are 55 images.

To visualize the images before we get going, and also to visualize our success in segmentation, we can build a dashboard that shows the images and optionally an overlay of the segmentation mask. We first need a function to create the overlay.

[3]:
def make_overlay(im, im_bw):
    """Create RGB image with cyan channel saturated via segmentation mask."""
    # Make float and normalized image
    im = np.copy(im).astype(float)

    # Build RGB image by stacking grayscale images
    im_rgb = np.dstack(3 * [im / im.max()])

    # Saturate cyan channel wherever there are white pixels in thresh image
    im_rgb[im_bw, 0] = 1.0

    return im_rgb

And now we can write a function to make the dashboard. It would be easier to make the dashboard and display it here. It will get automatically updated whenever we change the thresholding. So, after updating our thresholding method (with thresholded images stored in ims_bw), we should return to the dashboard to check it out.

[4]:
def dashboard():
    ip_distance = 0.0645
    dt = 15
    frame_width = 200
    frame_height = frame_width * ims[0].shape[0] // ims[1].shape[1]
    bounds = [
        0,
        0,
        ims[0].shape[1] * ip_distance,
        ims[0].shape[0] * ip_distance,
    ]
    opts = dict(
        xlabel="µm",
        ylabel="µm",
        frame_width=frame_width,
        frame_height=frame_height,
    )

    frame_slider = pn.widgets.IntSlider(name="frame", start=1, end=55, value=1)
    overlay_selector = pn.widgets.Checkbox(name="overlay", value=False,)

    @pn.depends(frame_slider.param.value, overlay_selector.param.value)
    def _show_bacillus(frame, overlay):
        im = ims[frame - 1]
        title = f"t = {dt*(frame-1)} min"

        if overlay:
            im_rgb = make_overlay(im, ims_bw[frame - 1])
            return hv.RGB(im_rgb, bounds=bounds).opts(title=title, **opts)
        else:
            return hv.Image(im, bounds=bounds).opts(
                title=title, cmap="viridis", **opts
            )

    return pn.Row(
        _show_bacillus,
        pn.Spacer(width=15),
        pn.Column(
            pn.Spacer(height=30),
            frame_slider,
            pn.Spacer(height=15),
            overlay_selector,
        ),
    )


dashboard()
[4]:

b) For segmentation, we will take a similar approach as in Lesson 37 to segment the image. We will simply threshold the image using Otsu’s method. Since the dashboard is immediately above this cell, we can use it to look at this

[5]:
ims_bw = [im > skimage.filters.threshold_otsu(im) for im in ims]

Looking again at the dashboard, this is pretty good. Certainly good enough for generating a growth curve.

c) See the dashboard above.

The segmentation overshoots the bacteria. We might want to try to refine this a bit. We could do adaptive thresholding. We perform automatic thresholding on subimages throughout the image. We will take subimages to be 51 \(\times\) 51, since that would span the width of one bacterium, and then some, thereby avoiding the issue of thresholding within a single bacterium (and adaptive thresholding needs an odd image block size). Adaptive thresholding is achieved using skimage.filters.threshold_local(). Let’s give it a shot.

[6]:
# List of thresholded images
ims_bw = [im > skimage.filters.threshold_local(im, 51) for im in ims]

Checking out the dashboard again…. Whoa! It did a great job with the bacteria, but the background got really messed up. We could include pixels that are unity in both the adaptive and Otsu thresholding to take care of this. We then use np.logical_and() to do an element-by-element AND operation on the arrays.

[7]:
def thresh(im, block_size=51):
    """Combined adaptive and Otsu thresholding."""
    thresh_otsu = skimage.filters.threshold_otsu(im)
    im_bw = im > thresh_otsu
    return np.logical_and(im > skimage.filters.threshold_local(im, block_size), im_bw)

# Threshold images
ims_bw = [thresh(im) for im in ims]

Now looking at the dashboard, it is very nice!

d) To plot the growth curve, we can plot the total bacterial area on the \(y\)-axis (on a log scale), versus time on the \(x\)-axis. To get the total area, we need to compute the total number of “bacterial” pixels in a given image, multiplied by the pixel area, which is \((64.5\text{ nm})^2 = 4160.25\text{ nm}^2\). Also, according to the metadata, we have one frame every 15 minutes.

[8]:
# Compute pixel area
pixel_area = 0.0645**2

# Get total bacterial area
bac_area = pixel_area * np.array([np.sum(im_bw) for im_bw in ims_bw])

# Get time in units of hours
t = 0.25 * np.arange(len(ims_bw))

# Plot the result
p = bokeh.plotting.figure(
    height=250,
    width=400,
    y_axis_type='log',
    x_axis_label='time (hours)',
    y_axis_label='bacterial area (µm²)'
)

p.circle(
    x=t,
    y=bac_area
)

bokeh.io.show(p)

This looks pretty linear, suggesting that the bacteria are, indeed, performing exponential growth.

e) To perform the regression on the growth curve, we use np.polyfit() with the x values being time, and the y values being the natural log of the bacterial area. We fit with a first order polynomial, which is a line.

[9]:
# Use polyfit to get slope and intercept
slope, intercept = np.polyfit(t, np.log(bac_area), 1)

# Pull out and print parameters
print("""
m0 = {0:.2f} sq. µm
 r = {1:.2f} 1/hours
""".format(np.exp(intercept), slope))

# Generate smooth curve
t_smooth = np.linspace(0, t.max(), 200)
y_smooth = np.exp(intercept + slope * t_smooth)

# Add to the plot
p.line(
    x=t_smooth,
    y=y_smooth,
    color='orange',
    line_width=2,
)

bokeh.io.show(p)

m0 = 4.96 sq. µm
 r = 0.23 1/hours

There is some systematic error, it appears, with the data being consistently above the curve at short times and then below for intermediate times. This could be due to small changes in conditions over time, most likely delay in getting into exponential growth phase at short times.

Computing environment

[10]:
%load_ext watermark
%watermark -v -p numpy,pandas,skimage,bokeh,holoviews,panel,jupyterlab
Python implementation: CPython
Python version       : 3.9.12
IPython version      : 8.3.0

numpy     : 1.21.5
pandas    : 1.4.2
skimage   : 0.19.2
bokeh     : 2.4.2
holoviews : 1.14.8
panel     : 0.13.0
jupyterlab: 3.3.2