Exercise 5.10: Growth curves from a movie
At the dawn of the molecular revolution in biology, key experiments by Jacques Monod in which he measured growth curves of bacteria under different conditions exposed some of the mechanisms of regulation of gene expression. Those growth curves were measured in a bulk solution. In this exercise, we will measure bacterial growth starting from two bacteria. The movie shows Bacillus subtilis constitutively expressing mCherry growing under slow growth conditions. These data were kindly donated by Jin Park from the Elowitz lab at Caltech.
a) Load in the series of images contained in the directory ~git/bootcamp/data/bacterial_growth/
. Be sure that however you store them (a list or tuple or other object) has the frames in the proper order.
b) Segment the images to separate bacteria from background. You do not need to segment individual bacteria; this would likely require some more advanced techniques involving edge detection that we haven’t covered in bootcamp.
c) Show a representative image from the stack (with the segmentation overlayed) of images. Be sure to check out the README
file in the directory containing the images to get the interpixel distance.
d) If \(m\) is the mass of bacteria, which is proportional to the area in the images, then, for exponential growth,
\begin{align} m(t) = m_0\,\mathrm{e}^{rt}, \end{align}
where \(r\) is the growth rate. Taking the logarithm of both sides gives
\begin{align} \ln m(t) = \ln m_0 + rt. \end{align}
So, the slope of the line on a log-log scale is the growth rate.
With this in mind, plot the bacterial area as a function of time with the area on a log scale. If the plot is linear, the bacteria are performing exponential growth.
e) To get the slope of the line, you can use the np.polyfit()
function and perform the regression on a semilog scale. Alternatively, you can fit the exponential function directly using scipy.optimize.curve_fit()
. Once you find the growth rate (and intercept), plot the theoretical growth curve along with the data.
Solution
[1]:
import copy
import os
import glob
import numpy as np
import pandas as pd
import skimage.io
import skimage.morphology
import skimage.segmentation
import skimage.measure
import holoviews as hv
hv.extension('bokeh')
notebook_url = 'localhost:8888'
import bokeh.io
bokeh.io.output_notebook()
a) We will load the images into a tuple, where the index of the tuple corresponds to the frame number. Under normal circumstances, we would do data validation to make sure there are no skipped frames, but in the interest of brevity, we will proceed assuming all data are properly labeled and organized.
[2]:
# Use a list comprehension to read in the images
ims = [
skimage.io.imread(fname)
for fname in sorted(glob.glob("data/bacterial_growth/bacillus_*.tif"))
]
# Store it as a tuple so we don't mess with it
ims = tuple(ims)
# How many?
print("There are", len(ims), "images.")
There are 55 images.
To visualize the images before we get going, and also to visualize our success in segmentation, we can build a dashboard that shows the images and optionally an overlay of the segmentation mask. We first need a function to create the overlay.
[3]:
def make_overlay(im, im_bw):
"""Create RGB image with cyan channel saturated via segmentation mask."""
# Make float and normalized image
im = np.copy(im).astype(float)
# Build RGB image by stacking grayscale images
im_rgb = np.dstack(3 * [im / im.max()])
# Saturate cyan channel wherever there are white pixels in thresh image
im_rgb[im_bw, 0] = 1.0
return im_rgb
And now we can write a function to make the dashboard. It would be easier to make the dashboard and display it here. It will get automatically updated whenever we change the thresholding. So, after updating our thresholding method (with thresholded images stored in ims_bw
), we should return to the dashboard to check it out.
[5]:
# Make thresholded images
ims_bw = [im > skimage.filters.threshold_otsu(im) for im in ims]
# PArameters for the display of images
ip_distance = 0.0645
dt = 15
frame_width = 200
frame_height = frame_width * ims[0].shape[0] // ims[1].shape[1]
bounds = [
0,
0,
ims[0].shape[1] * ip_distance,
ims[0].shape[0] * ip_distance,
]
opts = dict(
xlabel="µm",
ylabel="µm",
frame_width=frame_width,
frame_height=frame_height,
)
# Slider for frame number
frame_slider = bokeh.models.Slider(title="frame", start=1, end=55, value=1, step=1)
# Build initial plot
frame = frame_slider.value
title = f"t = {dt*(frame-1)} min"
im = ims[frame - 1]
im_rgb = make_overlay(im, ims_bw[frame - 1])
p_overlay = hv.render(hv.RGB(im_rgb, bounds=bounds).opts(title="overlay", **opts))
p_phase = hv.render(
hv.Image(im, bounds=bounds).opts(title=title, cmap="viridis", **opts)
)
p_phase.x_range = p_overlay.x_range
p_phase.y_range = p_overlay.y_range
def callback(attr, old, new):
frame = frame_slider.value
title = f"t = {dt*(frame-1)} min"
im = ims[frame - 1]
im_rgb = make_overlay(im, ims_bw[frame - 1])
dummy_p_overlay = hv.render(hv.RGB(im_rgb, bounds=bounds).opts(title=title, **opts))
dummy_p = hv.render(
hv.Image(im, bounds=bounds).opts(title="overlay", cmap="viridis", **opts)
)
p_overlay.renderers[0].data_source.data = dict(
dummy_p_overlay.renderers[0].data_source.data
)
p_phase.renderers[0].data_source.data = dict(dummy_p.renderers[0].data_source.data)
# Link callback
frame_slider.on_change("value", callback)
# Build layout
layout = bokeh.layouts.column(
bokeh.layouts.column(
frame_slider,
bokeh.layouts.row(p_phase, p_overlay),
)
)
def app(doc):
doc.add_root(layout)
# Away we go!
bokeh.io.show(app)
b) For segmentation, we will take a similar approach as in Lesson 44 to segment the image. We will simply threshold the image using Otsu’s method. The dashboard above demonstrates that this works well.
c) See the dashboard above.
The segmentation overshoots the bacteria. We might want to try to refine this a bit. We could do adaptive thresholding. We perform automatic thresholding on subimages throughout the image. We will take subimages to be 51 \(\times\) 51, since that would span the width of one bacterium, and then some, thereby avoiding the issue of thresholding within a single bacterium (and adaptive thresholding needs an odd image block size). Adaptive thresholding is achieved using
skimage.filters.threshold_local()
. Let’s give it a shot.
[6]:
# List of thresholded images
ims_bw = [im > skimage.filters.threshold_local(im, 51) for im in ims]
Checking out the dashboard again…. Whoa! It did a great job with the bacteria, but the background got really messed up. We could include pixels that are unity in both the adaptive and Otsu thresholding to take care of this. We then use np.logical_and()
to do an element-by-element AND
operation on the arrays.
[7]:
def thresh(im, block_size=51):
"""Combined adaptive and Otsu thresholding."""
thresh_otsu = skimage.filters.threshold_otsu(im)
im_bw = im > thresh_otsu
return np.logical_and(im > skimage.filters.threshold_local(im, block_size), im_bw)
# Threshold images
ims_bw = [thresh(im) for im in ims]
Now looking at the dashboard, it is very nice!
d) To plot the growth curve, we can plot the total bacterial area on the \(y\)-axis (on a log scale), versus time on the \(x\)-axis. To get the total area, we need to compute the total number of “bacterial” pixels in a given image, multiplied by the pixel area, which is \((64.5\text{ nm})^2 = 4160.25\text{ nm}^2\). Also, according to the metadata, we have one frame every 15 minutes.
[8]:
# Compute pixel area
pixel_area = 0.0645**2
# Get total bacterial area
bac_area = pixel_area * np.array([np.sum(im_bw) for im_bw in ims_bw])
# Get time in units of hours
t = 0.25 * np.arange(len(ims_bw))
# Plot the result
p = bokeh.plotting.figure(
height=250,
width=400,
y_axis_type='log',
x_axis_label='time (hours)',
y_axis_label='bacterial area (µm²)'
)
p.scatter(
x=t,
y=bac_area
)
bokeh.io.show(p)
This looks pretty linear on the semilog plot, suggesting that the bacteria are, indeed, performing exponential growth.
e) To perform the regression on the growth curve, we use np.polyfit()
with the x
values being time, and the y
values being the natural log of the bacterial area. We fit with a first order polynomial, which is a line.
[9]:
# Use polyfit to get slope and intercept
slope, intercept = np.polyfit(t, np.log(bac_area), 1)
# Pull out and print parameters
print("""
m0 = {0:.2f} sq. µm
r = {1:.2f} 1/hours
""".format(np.exp(intercept), slope))
# Generate smooth curve
t_smooth = np.linspace(0, t.max(), 200)
y_smooth = np.exp(intercept + slope * t_smooth)
# Add to the plot
p.line(
x=t_smooth,
y=y_smooth,
color='orange',
line_width=2,
)
bokeh.io.show(p)
m0 = 4.96 sq. µm
r = 0.23 1/hours
There is some systematic error, it appears, with the data being consistently above the curve at short times and then below for intermediate times. This could be due to small changes in conditions over time, most likely delay in getting into exponential growth phase at short times.
Computing environment
[10]:
%load_ext watermark
%watermark -v -p numpy,pandas,skimage,bokeh,holoviews,jupyterlab
Python implementation: CPython
Python version : 3.11.9
IPython version : 8.20.0
numpy : 1.26.4
pandas : 2.2.1
skimage : 0.22.0
bokeh : 3.4.1
holoviews : 1.18.3
jupyterlab: 4.0.13