Lesson 35: Basic image quantification

This tutorial was generated from a Jupyter notebook. You can download the notebook here.

In [1]:
# Our workhorse
import numpy as np

# Our image processing tools
import skimage.filters
import skimage.io
import skimage.measure
import skimage.morphology
import skimage.restoration
import skimage.segmentation

# This is how we import the module of Matplotlib we'll be using
import matplotlib.pyplot as plt

# Seaborn makes plots pretty!
import seaborn as sns

# Magic function to make matplotlib inline; other style specs must come AFTER
%matplotlib inline

# This enables SVG graphics inline (only use with static plots (non-Bokeh))
%config InlineBackend.figure_format = 'svg'

# Set JB's favorite Seaborn settings
rc={'lines.linewidth': 2, 'axes.labelsize': 18, 'axes.titlesize': 18, 
    'axes.facecolor': 'DFDFE5'}
sns.set_context('notebook', rc=rc)

Now that we have learned how to do basic segmentation, we continue our image processing lessons to learn how to obtain quantitative data from images.

In this lesson, we will expand on what we learned in the first image processing tutorial and develop some further skills to help us with segmentation of images. The images we will use were acquired by Griffin Chure in Rob Phillips's lab here at Caltech. The bacteria are the HG104 E. coli strain, developed for an earlier paper from the Phillips lab. The strain is wild type, except the LacZYA genes are deleted. It features a YFP gene hooked up to the Lac promoter. Thus, the fluorescent signal is a measure of gene expression governed by the Lac promoter in the absence of several of its repressors.

In the previous lesson, we made strong arguments for performing segmentation on images of bacteria constitutively expressing a fluorescent protein. The main motivation is that the haloing effect in phase contrast images makes segmenting bacteria in large clumps very difficult. However, for the images we are analyzing here, we do not have a constitutive fluorescent protein. It is a bad idea to use the fluorescent signal we are measuring to do the segmentation, since this could introduce bias, especially when we have very low fluorescent signal for some cells. So, in this experiment, we will do the segmentation using the phase image and then use the fluorescent image to get a measure of the fluorescence intensity for each bacterium. It is important to have a dilute field of bacteria so that we do not have clumps of bacteria that make segmentation difficult.

Importantly, we are free to manipulate the brightfield image as we like in order to get good segmentation. After we have identified which pixels belong to which cells, we have to be very careful adjusting the fluorescent images, as the pixel values in these images are the signal we are measuring. We will only employ a median filter with a very small structuring element to deal with the known camera issue of occasional rogue high intensity pixels.
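To see why a small median filter deals with rogue pixels while leaving everything else alone, here is a minimal NumPy-only sketch. The helper `median_filter_3x3` is hypothetical (it is not how skimage implements its filter), and replicating the edge pixels at the border is an assumption made just for this illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_3x3(im):
    """Hypothetical helper: 3x3 median filter with replicated edges."""
    # Pad by one pixel so every pixel has a full 3x3 neighborhood
    padded = np.pad(im, 1, mode='edge')

    # View of all 3x3 neighborhoods, shape (n_rows, n_cols, 3, 3)
    windows = sliding_window_view(padded, (3, 3))

    # Median over each neighborhood, cast back to the input data type
    return np.median(windows, axis=(-2, -1)).astype(im.dtype)


# Flat synthetic image with a single rogue high-intensity pixel
im = np.full((5, 5), 100, dtype=np.uint16)
im[2, 2] = 4095

im_filt = median_filter_3x3(im)
print(im_filt[2, 2])
```

The rogue pixel is one value out of nine in every neighborhood that contains it, so it never becomes the median, and the filtered image is flat again.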

Inspecting the images

Let's start by just getting a look at the images to see what we're dealing with. You can download the phase image here and the fluorescent image here.

In [2]:
# Load the images
im_phase = skimage.io.imread('../data/HG104_phase.tif')
im_fl = skimage.io.imread('../data/HG104_FITC.tif')

# Display side-by-side
with sns.axes_style('dark'):
    fig, ax = plt.subplots(1, 2, figsize=(9.5, 8))
    ax[0].imshow(im_phase, cmap=plt.cm.gray)
    ax[1].imshow(im_fl, cmap=plt.cm.gray)

Argh. We see again that a few rogue pixels ruin the display of the fluorescent image. They also present a problem with the phase image, so we'll do a quick median filter on both images to clean things up.

In [3]:
# Structuring element
selem = skimage.morphology.square(3)

# Perform median filter
im_phase_filt = skimage.filters.median(im_phase, selem)
im_fl_filt = skimage.filters.median(im_fl, selem)

# Show the images again
with sns.axes_style('dark'):
    fig, ax = plt.subplots(1, 2, figsize=(9.5, 8))
    ax[0].imshow(im_phase_filt, cmap=plt.cm.gray)
    ax[1].imshow(im_fl_filt, cmap=plt.cm.gray)

So, we will proceed to segment in phase and then use the fluorescence data to quantify expression levels.

Segmenting the bacteria

As we go about doing the segmentation, it will be instructive to zoom in on a region of the image with some interesting features. We define the indices of the subimage for use going forward.

In [4]:
# Indices of subimage
r, c = np.indices((200, 300)) + 550

# Look at subimage
with sns.axes_style('dark'):
    plt.imshow(im_phase_filt[r, c], cmap=plt.cm.gray)

The image is littered with small dots. These are the result of condensation of water vapor on the glass in front of the CCD of the camera. This is a difficult issue in the particular lab space where this image was taken; there are issues with the air conditioning. (Yes, that's right, the air conditioning can affect your data!)
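As an aside on the indexing used for the subimage: np.indices((200, 300)) + 550 builds row indices 550 through 749 and column indices 550 through 849, so im[r, c] picks out the same pixels as the slice im[550:750, 550:850]. The small sketch below checks this equivalence on a toy array; the array and its sizes are made up for illustration.

```python
import numpy as np

# Toy "image" whose pixel values encode their position
im = np.arange(100).reshape(10, 10)

# Index arrays for a 2-by-3 subimage whose corner is at row 5, column 5
r, c = np.indices((2, 3)) + 5

# Fancy indexing with the index arrays...
sub_fancy = im[r, c]

# ...selects the same pixels as ordinary slicing
sub_slice = im[5:7, 5:8]

print(np.array_equal(sub_fancy, sub_slice))
```

One difference worth knowing: indexing with index arrays returns a copy, while slicing returns a view of the original array.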

Denoising the image

As we prepare for segmentation, we should begin by denoising the image. We already did a median filter to get rid of occasional bad pixels.

We may also want to get rid of all the dots on the camera glass. These are the little circles in the image. We can use a total variation filter for this purpose. These filters compute local derivatives in the image and try to limit them. The result is flatter, sometimes "cartoon-like" images. We will not use this going forward, but only show it here for illustrative purposes.

In [5]:
# Perform a Chambolle total variation filter.
im_phase_tv = skimage.restoration.denoise_tv_chambolle(
                                            im_phase_filt, weight=0.005)

# Look at result
with sns.axes_style('dark'):
    fig, ax = plt.subplots(1, 2, figsize=(9.5, 8))
    ax[0].imshow(im_phase_filt[r, c], cmap=plt.cm.RdBu_r)
    ax[1].imshow(im_phase_tv[r, c], cmap=plt.cm.RdBu_r)

Going forward, we will use the median filtered image.
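To make the idea of "limiting local derivatives" concrete, the sketch below computes the quantity a total variation filter tries to keep small: the summed absolute differences between neighboring pixels. This is only the measure, not the Chambolle algorithm, and the function name `total_variation` and the synthetic images are assumptions for illustration.

```python
import numpy as np


def total_variation(im):
    """Hypothetical helper: sum of absolute neighbor differences."""
    return (np.abs(np.diff(im, axis=0)).sum()
            + np.abs(np.diff(im, axis=1)).sum())


# A "cartoon-like" image: two flat regions separated by one sharp edge
im_flat = np.zeros((20, 20))
im_flat[:, 10:] = 1.0

# The same image with pixel noise added
rng = np.random.default_rng(0)
im_noisy = im_flat + 0.1 * rng.normal(size=im_flat.shape)

tv_flat = total_variation(im_flat)
tv_noisy = total_variation(im_noisy)
print(tv_flat, tv_noisy)
```

The sharp edge contributes little to the total variation, while the pixel noise contributes a lot, which is why penalizing this quantity flattens noise but tends to preserve edges.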

Segmenting the bacteria with good ol' thresholding

Assuming we have taken care of any illumination issues that would preclude us from using a global thresholding method, we will go ahead and use a global threshold on the phase images to separate bacteria from background. Note that this is one of many ways of segmenting the bacteria, and many more sophisticated and powerful methods are possible.

In [6]:
# Compute Otsu threshold value for median filtered image
thresh_otsu = skimage.filters.threshold_otsu(im_phase_filt)

# Construct thresholded image
im_bw = im_phase_filt < thresh_otsu

# Display images
with sns.axes_style('dark'):
    fig, ax = plt.subplots(2, 2, figsize=(9.5, 8))
    ax[0,0].imshow(im_phase_filt, cmap=plt.cm.gray)
    ax[0,1].imshow(im_bw, cmap=plt.cm.gray)
    ax[1,0].imshow(im_phase_filt[r, c], cmap=plt.cm.gray)
    ax[1,1].imshow(im_bw[r, c], cmap=plt.cm.gray)

We see that we have lots of tiny particles that are not considered background. We will deal with those explicitly in a moment. Otherwise, we have effectively separated bacteria from background.
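For readers curious about what an Otsu threshold is, here is a minimal NumPy sketch of the idea: choose the histogram split that maximizes the variance between the two resulting classes. The function `otsu_threshold` is a hypothetical stand-in written for this illustration, not the skimage source, and the synthetic image is made up.

```python
import numpy as np


def otsu_threshold(im, nbins=256):
    """Hypothetical helper: threshold maximizing between-class variance."""
    counts, edges = np.histogram(im.ravel(), bins=nbins)
    counts = counts.astype(float)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Number of pixels at or below each bin (w0) and at or above it (w1)
    w0 = np.cumsum(counts)
    w1 = np.cumsum(counts[::-1])[::-1]

    # Mean intensity of each of those two classes
    m0 = np.cumsum(counts * centers) / w0
    m1 = (np.cumsum((counts * centers)[::-1]) / w1[::-1])[::-1]

    # Between-class variance for a split placed after each bin
    var_between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2

    return centers[np.argmax(var_between)]


# Synthetic phase-like image: bright background with a dark 3x3 "bacterium"
im = np.full((10, 10), 200.0)
im[3:6, 3:6] = 50.0

thresh = otsu_threshold(im)
im_bw = im < thresh
print(thresh, im_bw.sum())
```

As in the lesson, the dark pixels are the ones below the threshold, so the bacterium is recovered with `im < thresh`.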

Since we will be quantifying fluorescent intensity in individual bacteria, we want to make sure only whole bacteria are considered. Therefore, we should clear off any bacteria that are touching the border of the image. There is a very convenient way to do this.

In [7]:
# Clear border with 5 pixel buffer
im_bw = skimage.segmentation.clear_border(im_bw, buffer_size=5)

Quick aside: median filter fluorescent image

Because we will be using the fluorescent image soon, we'll first perform a median filter to make sure we don't have any rogue pixels.

In [8]:
# Load fluorescent image
im_fl = skimage.io.imread('../data/HG104_FITC.tif')

# Perform median filter on fluorescent image
im_fl_filt = skimage.filters.median(im_fl, selem)

Quantifying the segmented regions

We have many white regions in the image, but only some of them are bacteria. We would like to be able to identify what is bacteria and what is noise. To this end, we need to compute the number of pixels in each white "island" in the thresholded image. We can use the skimage.measure.regionprops function to do this (and, as we will see, so much more!). Before we can use it, we have to label the image. That is, we need to assign a unique label to each white region in the thresholded image.
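To make "labeling" concrete, here is a small sketch that assigns a unique integer to each white island by flood filling from every unlabeled white pixel. The function `label_islands` is hypothetical and written only for illustration. It treats diagonal neighbors as connected, which is an assumption of this sketch, and it labels background as 0.

```python
from collections import deque

import numpy as np


def label_islands(im_bw):
    """Hypothetical helper: label 8-connected white regions 1, 2, 3, ..."""
    n_rows, n_cols = im_bw.shape
    labels = np.zeros(im_bw.shape, dtype=int)
    n_labels = 0

    for i in range(n_rows):
        for j in range(n_cols):
            # Start a new island at each white pixel not yet labeled
            if im_bw[i, j] and labels[i, j] == 0:
                n_labels += 1
                labels[i, j] = n_labels
                queue = deque([(i, j)])

                # Breadth-first flood fill over the 8 neighbors
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            yy, xx = y + dy, x + dx
                            if (0 <= yy < n_rows and 0 <= xx < n_cols
                                    and im_bw[yy, xx]
                                    and labels[yy, xx] == 0):
                                labels[yy, xx] = n_labels
                                queue.append((yy, xx))

    return labels, n_labels


im_bw = np.array([[1, 1, 0, 0, 0],
                  [0, 1, 0, 0, 1],
                  [0, 0, 0, 1, 1],
                  [1, 0, 0, 0, 0]], dtype=bool)

labels, n_labels = label_islands(im_bw)
print(labels)
print('Number of individual regions = ', n_labels)
```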

In [9]:
# Label binary image; background kwarg says value in im_bw to be background
im_labeled, n_labels = skimage.measure.label(
                            im_bw, background=0, return_num=True)

# Increment labels by one for future compatibility
im_labeled += 1

# See result (one of the few times it's ok to use rainbow colormap!)
with sns.axes_style('dark'):
    plt.imshow(im_labeled, cmap=plt.cm.rainbow)

# Show number of regions
print('Number of individual regions = ', n_labels)
Number of individual regions =  202

We have 202 regions, but only 17 of these are bacteria, as we can see by hand counting. We want to eliminate the small ones, so we will calculate the area of all of the regions. skimage.measure.regionprops computes the area of each region, in addition to lots of other useful statistics. Conveniently, through the intensity_image keyword argument, we can specify a corresponding intensity image, in this case the fluorescent image. skimage.measure.regionprops then also computes, among other things, the average intensity for each of the regions. Ultimately, we want the integrated intensity, which is just the average intensity times the area.

Note also that we incremented the labels by one. This is because in skimage version 0.11.*, skimage.measure.label labels the background as -1 and the first non-background region as 0. This changes in later versions of skimage such that the background is labeled 0, so with those versions the increment should be omitted. skimage.measure.regionprops expects the background to be labeled zero, with other labels starting at 1, which is why we incremented the labels by 1.
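As a quick check of the statement that the integrated intensity is the average intensity times the area, the sketch below computes both sides directly with NumPy on a tiny made-up labeled image. It assumes background is labeled 0 and regions are labeled 1, 2, and so on; the arrays are hypothetical.

```python
import numpy as np

# Tiny labeled image: 0 is background, 1 and 2 are regions
labels = np.array([[0, 1, 1],
                   [0, 0, 1],
                   [2, 2, 0]])

# Matching intensity image (stand-in for the fluorescent image)
intensity = np.array([[5., 10., 20.],
                      [5.,  5., 30.],
                      [7.,  9.,  5.]])

# Area of each label is the number of pixels carrying it
areas = np.bincount(labels.ravel())

# Summed intensity of each label
sums = np.bincount(labels.ravel(), weights=intensity.ravel())

# Mean intensity, and integrated intensity as mean times area
means = sums / areas
integrated = means * areas

# Skip index 0, which is the background
print(integrated[1:])
```

The integrated intensity is therefore nothing more than the sum of the pixel values in the region, here 60 for region 1 and 16 for region 2.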

In [10]:
# Extract region props
im_props = skimage.measure.regionprops(im_labeled, intensity_image=im_fl_filt)

That's pretty much all there is to it! We now need to eliminate regions that are too small to be bacteria. First, let's get an estimate for how big a bacterium is. We'll look again at our zoomed image.

In [11]:
# Show zoomed in image
with sns.axes_style('dark'):
    plt.imshow(im_phase_filt[r, c], cmap=plt.cm.gray)

Based on the image, I would estimate the bacteria are about 30 pixels long and 10 pixels wide, or about 300 pixels in area, which is a bit of an underestimate. So, let's say the cutoff for being a bacterium is about half that, or 150 pixels. We can use all of the data conveniently stored in im_props to clear up our image.

In [12]:
# Make a filtered black and white image
im_bw_filt = im_labeled > 0

# Define cutoff size
cutoff = 150

# Loop through image properties, delete small objects, and count the rest
n_regions = 0
for prop in im_props:
    if prop.area < cutoff:
        im_bw_filt[im_labeled==prop.label] = False
    else:
        n_regions += 1

# Look at result
with sns.axes_style('dark'):
    plt.imshow(im_bw_filt, cmap=plt.cm.gray)

# Show number of regions
print('Number of individual regions = ', n_regions)
Number of individual regions =  17

So, considering only the larger features in the image, we have found the bacteria.
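The same size filter can also be written without an explicit loop. The sketch below is one hedged alternative on a made-up labeled image: it counts pixels per label with np.bincount and then uses the labeled image itself to look up whether each pixel's region is kept. It assumes background is labeled 0; the arrays and the small cutoff are for illustration only.

```python
import numpy as np

# Made-up labeled image: 0 is background, 1 through 4 are regions
labels = np.array([[1, 1, 1, 0, 2],
                   [1, 1, 1, 0, 0],
                   [0, 0, 0, 0, 3],
                   [4, 4, 0, 3, 3]])

# Small cutoff suited to this toy example
cutoff = 3

# Area of every label, indexed by label value
areas = np.bincount(labels.ravel())

# Keep regions at or above the cutoff, and never keep the background
keep = areas >= cutoff
keep[0] = False

# Look up each pixel's label in the keep array to build the filtered image
im_bw_filt = keep[labels]
n_regions = keep.sum()

print('Number of individual regions = ', n_regions)
```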

Now, there is still the issue of regions that contain two bacteria. Again, we look at our zoomed in region.

In [13]:
# Show zoomed in image
with sns.axes_style('dark'):
    plt.imshow(im_bw_filt[r, c], cmap=plt.cm.gray)

The bacteria in the upper right are clearly two different bacteria and may or may not be sisters. However, the second region from the left is either a bacterium that is dividing, or has just divided. Either way, you might still want to keep this as a single bacterium.

So, we can test for bacteria that are next to each other, versus those that are in line with each other, as would be the case for a dividing bacterium. Two bacteria lying side by side form a rounder region than a single or dividing bacterium, so we can use the eccentricity measure of the region to tell them apart. According to the skimage documentation,

Eccentricity of the ellipse that has the same second-moments as the region. The eccentricity is the ratio of the distance between its minor and major axis length. The value is between 0 and 1.

So, we only want objects with large eccentricity, say above 0.8.
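To see why a cutoff of 0.8 is plausible, here is a sketch that computes the eccentricity of idealized rectangular regions from the second moments of their pixel coordinates, using e = sqrt(1 - lambda_min / lambda_max), where the lambdas are the eigenvalues of the coordinate covariance matrix. The function `eccentricity` and the rectangle sizes (10 by 30 pixels for one bacterium, following the estimate above) are assumptions for illustration; this follows the idea of the measure rather than reproducing the skimage code.

```python
import numpy as np


def eccentricity(mask):
    """Hypothetical helper: eccentricity from pixel-coordinate moments."""
    # Row and column coordinates of every pixel in the region
    coords = np.argwhere(mask).astype(float)

    # Second central moments of the coordinates
    cov = np.cov(coords, rowvar=False)

    # Eigenvalues in ascending order are proportional to the squared
    # minor and major axis lengths of the matching ellipse
    lam_min, lam_max = np.linalg.eigvalsh(cov)

    return np.sqrt(max(0.0, 1.0 - lam_min / lam_max))


def block(n_rows, n_cols):
    """Idealized rectangular region."""
    return np.ones((n_rows, n_cols), dtype=bool)


ecc_single = eccentricity(block(10, 30))        # one bacterium
ecc_side_by_side = eccentricity(block(20, 30))  # two next to each other
ecc_end_to_end = eccentricity(block(10, 60))    # two in line (dividing)

print(ecc_single, ecc_side_by_side, ecc_end_to_end)
```

In this idealized picture, the single and the end-to-end regions come out above 0.8, while the side-by-side pair falls below it.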

In [14]:
# Loop through image properties, delete small and round objects, count the rest
n_regions = 0
for prop in im_props:
    if prop.area < cutoff or prop.eccentricity < 0.8:
        im_bw_filt[im_labeled==prop.label] = False
    else:
        n_regions += 1

# Look at result
with sns.axes_style('dark'):
    plt.imshow(im_bw_filt[r, c], cmap=plt.cm.gray)

# Show number of regions
print('Number of individual regions = ', n_regions)
Number of individual regions =  16

Voila! Now, we can go ahead and compute the summed intensity of all regions we are interested in.

In [15]:
# Initialize list of intensities of individual bacteria
int_intensity = []

# Loop through regions and compute integrated intensity of bacteria
for prop in im_props:
    if prop.area > cutoff and prop.eccentricity > 0.8:
        int_intensity.append(prop.area * prop.mean_intensity)

# Convert list to NumPy array
int_intensity = np.array(int_intensity)

# Take a look
int_intensity
array([  78067.,   69420.,  103335.,   95476.,   87503.,   88453.,
         99484.,   99907.,   77948.,   98107.,  139362.,   98159.,
         79922.,   73701.,  110855.,  107903.])

The integrated intensities differ by about a factor of two from the lowest to the highest.

Overlay of fluorescent image

Finally, we can overlay our fluorescent image with the phase contrast image. We will take the fluorescent color to be green, and ranging from zero to one.

In [16]:
# Build RGB image by stacking grayscale images
im_rgb = np.dstack(3 * [im_phase_filt / im_phase_filt.max()])

# Substitute the green channel
im_rgb[im_bw_filt, 1] = im_fl_filt[im_bw_filt] / im_fl_filt.max()

# Show the result
with sns.axes_style('dark'):
    fig, ax = plt.subplots(1, 2, figsize=(9.5, 8))
    ax[0].imshow(im_phase_filt[r, c], cmap=plt.cm.gray)
    ax[1].imshow(im_rgb[r, c, :])
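To see what the stacking and channel substitution do to individual pixels, here is the same construction on a tiny made-up example. The 2-by-2 arrays and the mask are hypothetical stand-ins for the filtered phase image, the filtered fluorescent image, and the segmentation.

```python
import numpy as np

# Stand-ins for the phase image, fluorescent image, and segmentation mask
im_phase = np.array([[100., 200.],
                     [ 50., 200.]])
im_fl = np.array([[10., 0.],
                  [40., 0.]])
mask = np.array([[True, False],
                 [True, False]])

# Gray RGB image: the same scaled phase values in all three channels
im_rgb = np.dstack(3 * [im_phase / im_phase.max()])

# Inside the mask, replace the green channel with scaled fluorescence
im_rgb[mask, 1] = im_fl[mask] / im_fl.max()

print(im_rgb.shape)
print(im_rgb[1, 0])
```

Pixels outside the mask stay gray because all three channels are equal, while pixels inside the mask are tinted according to their fluorescence: green where the signal is strong, and toward magenta where it is weaker than the phase value.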