Lesson 45: Basic image quantification
[1]:
import numpy as np
# Our image processing tools
import skimage.filters
import skimage.io
import skimage.measure
import skimage.morphology
import skimage.restoration
import skimage.segmentation
import bootcamp_utils
import colorcet
import bokeh.io
import bokeh.layouts
import bokeh.models
import bokeh.palettes
bokeh.io.output_notebook()
Now that we have learned how to do basic segmentation, we continue our image processing lessons to learn how to obtain quantitative data from images.
In this lesson, we will expand on what we learned in the first image processing tutorial and develop some further skills to help us with segmentation of images. The images we will use were acquired by Griffin Chure in Rob Phillips’s lab at Caltech. The bacteria are the HG105 E. coli strain, developed for an earlier paper from the Phillips lab. The strain is wild type, except the LacZYA genes are deleted. It features a YFP gene hooked up to the Lac promoter. Thus, the fluorescent signal is a measure of gene expression governed by the Lac promoter in the absence of several of its repressors.
In the previous lesson, we made strong arguments for performing segmentation on images of bacteria constitutively expressing a fluorescent protein. The main motivation is that the haloing effect from phase contrast images makes segmenting bacteria in large clumps very difficult. However, for the images we are analyzing here, we do not have a constitutive fluorescent protein. It is a bad idea to use the fluorescent signal we are measuring to do the segmentation, since this could introduce bias, especially when we have very low fluorescent signal for some cells. So, in this experiment, we will do the segmentation using the phase image and then use the fluorescent image to get a measure of the fluorescence intensity for each bacterium. It is important to have a dilute field of bacteria so that we do not have clumps of bacteria that make segmentation difficult.
Importantly, we are free to manipulate the brightfield image as we like in order to get good segmentation. After we have identified which pixels belong to which cells, we have to be very careful when adjusting the fluorescent images, as the pixel values in these images are the signal we are measuring. We will only employ a median filter with a very small structuring element to deal with the known camera issue of occasional rogue high intensity pixels.
Inspecting the images
Let’s start by just getting a look at the images to see what we’re dealing with here. A collection of images can be found in data/HG105_images, which will be used for this lesson and the following practice section. The specific images we will be looking at here are noLac_phase_0004.tif and noLac_FITC_0004.tif. Since we are not quantifying the shape of these cells, and for ease, we will not scale the axes and instead leave them in units of pixels.
We again downsample for web display. You can adjust the downsample variable below if you are running a live notebook.
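To make the downsampling concrete, here is a minimal sketch on a hypothetical 4×4 test array (not one of the lesson's images). skimage.measure.block_reduce() replaces each non-overlapping 2×2 block with its mean, and casting back to the original integer type truncates the fractional part.

```python
import numpy as np
import skimage.measure

# Hypothetical 4x4 "image" with the same dtype as the lesson's images
im = np.arange(16, dtype=np.uint16).reshape(4, 4)

# Average over non-overlapping 2x2 blocks, then cast back to uint16
im_small = skimage.measure.block_reduce(im, (2, 2), np.mean).astype(im.dtype)

print(im_small.shape)  # (2, 2)
print(im_small)
```

The block means here are 2.5, 4.5, 10.5, and 12.5, which become 2, 4, 10, and 12 after the cast.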
[2]:
# Load the images
im_phase = skimage.io.imread("data/HG105_images/noLac_phase_0004.tif")
im_fl = skimage.io.imread("data/HG105_images/noLac_FITC_0004.tif")
# Downsample if necessary
downsample = True
if downsample:
    im_phase = skimage.measure.block_reduce(im_phase, (2, 2), np.mean).astype(
        im_phase.dtype
    )
    im_fl = skimage.measure.block_reduce(im_fl, (2, 2), np.mean).astype(
        im_fl.dtype
    )
# Set up display options
height_pixels, width_pixels = im_phase.shape
bounds = [0, 0, *im_phase.shape]
frame_height = 200
frame_width = im_phase.shape[1] * frame_height // im_phase.shape[0]
kwargs = dict(
    frame_height=200,
    color_mapper=bokeh.models.LinearColorMapper(bokeh.palettes.viridis(256)),
    x_axis_label='',
    y_axis_label='',
)
# Display side-by-side
plots = [
    bootcamp_utils.imshow(im_phase, **kwargs),
    bootcamp_utils.imshow(im_fl, **kwargs),
]
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))
Argh. We can see a few issues in these images. First, it appears that the illumination in the phase contrast image is not uniform and is darker at the top than at the bottom. Additionally, we can again see that a couple of rogue pixels are ruining the display of the fluorescent image. They also present a problem with the phase image, so we’ll do a quick median filter on both images to clean things up before we try to conquer the issue of uneven illumination.
[3]:
# Structuring element
selem = skimage.morphology.square(3)
# Perform median filter
im_phase_filt = skimage.filters.median(im_phase, footprint=selem)
im_fl_filt = skimage.filters.median(im_fl, footprint=selem)
# Show the images again
plots = [
    bootcamp_utils.imshow(im_phase_filt, **kwargs),
    bootcamp_utils.imshow(im_fl_filt, **kwargs),
]
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))
So, we will proceed to segment in phase and then use the fluorescence data to quantify expression levels. Before we begin our thresholding, however, we should correct our illumination issues in the phase contrast image which are likely caused by improper Köhler illumination.
Background subtraction
We can correct for this non-uniform illumination by performing a Gaussian background subtraction. It is important for us to notice that the uneven illumination spans across a distance much greater than that of a single bacterium. In a Gaussian background subtraction, small variations in pixel value across the image are blurred using a two-dimensional Gaussian function leaving only large-scale variations in intensity. To correct for the non-uniformity of the illumination, we can simply subtract our Gaussian blurred image from our original. To help visualize how this works, let’s look at the Gaussian blurred image.
[4]:
# Apply a Gaussian blur with a standard deviation (sigma) of 50 pixels.
im_phase_gauss = skimage.filters.gaussian(im_phase_filt, 50.0)
bokeh.io.show(bootcamp_utils.imshow(im_phase_gauss, **kwargs))
While our input image has a uint16 data type, our Gaussian filtered image is actually a float64, because skimage.filters.gaussian() by default rescales integer images to floats between 0 and 1. For the subtraction to be meaningful, our original image must also be converted to a float64 on that same scale. We can use the skimage.img_as_float() function to perform this conversion.
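A quick check of what the conversion does, using a hypothetical three-pixel uint16 array: the values are divided by the largest representable uint16 value, 65535, so they land between 0 and 1.

```python
import numpy as np
import skimage

# Hypothetical uint16 pixel values spanning the full range
pixels = np.array([0, 32768, 65535], dtype=np.uint16)

pixels_float = skimage.img_as_float(pixels)

print(pixels_float.dtype)
print(pixels_float)
```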
[5]:
# Convert median filtered image to float
im_phase_float = skimage.img_as_float(im_phase_filt)
# Subtract our gaussian blurred image from the original.
im_phase_sub = im_phase_float - im_phase_gauss
# Look at the background subtracted image
bokeh.io.show(bootcamp_utils.imshow(im_phase_sub, **kwargs))
Voilà! The pixel values appear much more uniform across the background of the image, meaning our segmentation safari can begin.
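The same idea can be sketched on a synthetic image, assuming a simple linear brightness ramp as the uneven illumination and one small dark rectangle standing in for a cell. All sizes and values here are hypothetical, chosen so that the blur scale (sigma of 10 pixels) is larger than the "cell" but much smaller than the ramp.

```python
import numpy as np
import skimage.filters

# Hypothetical image: brightness ramps from top to bottom (uneven illumination)
n = 200
rows = np.arange(n)[:, np.newaxis]
im = 0.2 + 0.3 * rows / (n - 1) * np.ones((1, n))

# Add one small dark "cell" near the center
im[95:105, 90:110] -= 0.1

# Blur on a scale larger than the cell, then subtract
im_bg = skimage.filters.gaussian(im, sigma=10)
im_sub = im - im_bg

# Spread of a background column far from the cell and from the image edges
before = np.ptp(im[50:150, 20])
after = np.ptp(im_sub[50:150, 20])

print(before, after)
print(im_sub[100, 100])
```

The background spread drops to essentially zero after subtraction, while the pixel inside the "cell" stays clearly negative, which is what makes a single global threshold workable afterward.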
Segmenting the bacteria
As we go about doing the segmentation, it will be instructive to zoom in on a region of the image with some interesting features. We define the indices of the subimage for use going forward. The np.s_ trick of NumPy is a useful way to do this. It makes a tuple of slices that you can use to easily slice NumPy arrays (remember, images are represented as NumPy arrays when you load them).
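A small sketch of what np.s_ produces, on a hypothetical 3×4 array: the result is an ordinary tuple of slice objects, so indexing with it is the same as writing the slices inline.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Store the slicing once so it can be reused on several arrays
slc = np.s_[0:2, 1:3]

print(slc)  # a tuple of two slice objects
print(a[slc])
```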
[6]:
# Indices of subimage
if downsample:
    slc = np.s_[0:175, 25:258]
else:
    slc = np.s_[0:350, 50:517]
# Look at subimage
bokeh.io.show(bootcamp_utils.imshow(im_phase_sub[slc], **kwargs))
The image is littered with small dots. These are the result of condensation of water vapor on the glass in front of the CCD of the camera. This is a difficult issue in the particular lab space where this image was taken; there are issues with the air conditioning. (Yes, that’s right, the air conditioning can affect your data!)
Denoising the image
As we prepare for segmentation, we should begin by denoising the image. We already did a median filter to get rid of occasional bad pixels.
We may also want to get rid of all the dots on the camera glass. These are the little circles in the image. We can use a total variation filter for this purpose. These filters compute local derivatives in the image and try to limit them. The result is flatter, sometimes “cartoon-like” images. We will not use this going forward, but only show it here for illustrative purposes.
[7]:
# Perform a Chambolle total variation filter.
im_phase_tv = skimage.restoration.denoise_tv_chambolle(
    im_phase_sub, weight=0.005)
# Look at result
plots = [
    bootcamp_utils.imshow(im_phase_sub[slc], **kwargs),
    bootcamp_utils.imshow(im_phase_tv[slc], **kwargs),
]
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))
Going forward, we will use the median-filtered, background-subtracted image (im_phase_sub) rather than the total variation filtered one.
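For a feel of what total variation denoising does, here is a minimal sketch on a hypothetical noisy step image. The weight of 0.1 is an illustrative choice for this synthetic data, not the value used in the lesson.

```python
import numpy as np
import skimage.restoration

rng = np.random.default_rng(0)

# Hypothetical image: dark left half, bright right half, plus Gaussian noise
im = np.zeros((64, 64))
im[:, 32:] = 1.0
im_noisy = im + rng.normal(0.0, 0.1, size=im.shape)

# weight is an illustrative choice, not the value used in the lesson
im_tv = skimage.restoration.denoise_tv_chambolle(im_noisy, weight=0.1)

# Noise level within the flat left half, before and after
print(im_noisy[:, :32].std(), im_tv[:, :32].std())

# Size of the step between the two halves after denoising
print(im_tv[:, 32:].mean() - im_tv[:, :32].mean())
```

The flat regions become smoother while the step between the halves is largely preserved, which is the "cartoon-like" behavior described above.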
Segmenting the bacteria with good ol’ thresholding
Assuming we have taken care of any illumination issues that would preclude us from using a global thresholding method, we will go ahead and use a global threshold on the phase images to separate bacteria from background. Note that this is one of many ways of segmenting the bacteria, and many more sophisticated and powerful methods are possible.
[8]:
# Compute Otsu threshold value for the median-filtered, background-subtracted image
thresh_otsu = skimage.filters.threshold_otsu(im_phase_sub)
# Construct thresholded image (bacteria are dark in phase contrast)
im_bw = im_phase_sub < thresh_otsu
# Display images
plots = [
    bootcamp_utils.imshow(im_phase_filt, **kwargs),
    bootcamp_utils.imshow(im_bw, **kwargs),
    bootcamp_utils.imshow(im_phase_filt[slc], **kwargs),
    bootcamp_utils.imshow(im_bw[slc], **kwargs),
]
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))
We see that we have some tiny particles that are not considered background. We will deal with those explicitly in a moment. Otherwise, we have effectively separated bacteria from background.
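The thresholding step can be tried on synthetic data as a sanity check. The sketch below assumes a hypothetical background-subtracted image with background values near zero and one darker square standing in for a cell, then compares the Otsu mask with the known answer.

```python
import numpy as np
import skimage.filters

rng = np.random.default_rng(0)

# Hypothetical background-subtracted image: background near 0, one dark "cell"
im = rng.normal(0.0, 0.01, size=(100, 100))
truth = np.zeros(im.shape, dtype=bool)
truth[40:60, 30:50] = True
im[truth] -= 0.3

# Otsu picks a threshold separating the two intensity populations
thresh = skimage.filters.threshold_otsu(im)

# Dark pixels are the objects of interest, hence the "less than"
im_bw = im < thresh

print(thresh)
print((im_bw == truth).mean())  # fraction of pixels classified correctly
```

On data this cleanly separated, nearly all pixels are classified correctly. Real images are noisier, which is why the later size and shape filters are needed.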
Since we will be quantifying fluorescent intensity in individual bacteria, we want to make sure only whole bacteria are considered. Therefore, we should clear off any bacteria that are touching the border of the image. There is a very convenient way to do this. The skimage.segmentation.clear_border() function takes a binary (black and white) image and clears out any white “island” that touches a border of the image.
[9]:
# Clear border with 5 pixel buffer
im_bw = skimage.segmentation.clear_border(im_bw, buffer_size=5)
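Here is a small sketch of the border clearing on a hypothetical 7×7 binary image with one island touching the top edge and one fully inside. It uses the default buffer_size rather than the 5 pixel buffer used above.

```python
import numpy as np
import skimage.segmentation

# Hypothetical binary image: one island touching the top edge, one interior
im_bw = np.zeros((7, 7), dtype=bool)
im_bw[0:2, 1:3] = True  # touches the border
im_bw[3:5, 3:5] = True  # fully inside

im_cleared = skimage.segmentation.clear_border(im_bw)

print(im_cleared.astype(int))
```

Only the interior island survives. A larger buffer_size also removes islands that come within that many pixels of the edge.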
Quantifying the segmented regions
We have many white regions in the image, but only some of them are bacteria. We would like to be able to identify what is bacteria and what is noise. To this end, we need to compute the number of pixels in each white “island” in the thresholded image. We can use the skimage.measure.regionprops() function to do this (and, as we will see, so much more!). Before we can use it, we have to label the image. That is, we need to assign a unique integer label to each white region in the thresholded image.
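To see what labeling produces, consider this sketch with a hypothetical binary array containing two separate islands. Each connected island gets its own positive integer, and the background stays at zero.

```python
import numpy as np
import skimage.measure

# Hypothetical binary image with two separate white islands
im_bw = np.array(
    [
        [1, 1, 0, 0, 0],
        [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 1, 1],
    ],
    dtype=bool,
)

im_labeled, n_labels = skimage.measure.label(im_bw, background=0, return_num=True)

print(n_labels)
print(im_labeled)
```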
[10]:
# Label binary image; background kwarg says value in im_bw to be background
im_labeled, n_labels = skimage.measure.label(im_bw, background=0, return_num=True)
# Show number of regions
print("Number of individual regions = ", n_labels)
# See result (one of the few times it's ok to use rainbow colormap!)
bokeh.io.show(
    bootcamp_utils.imshow(
        im_labeled,
        frame_height=200,
        color_mapper=bokeh.models.LinearColorMapper(colorcet.rainbow),
        x_axis_label="",
        y_axis_label="",
    )
)
Number of individual regions = 21
We have 21 regions, but not all of these are bacteria, as we can see by hand counting. We want to eliminate the small ones, so we will calculate the area of all of the regions. Note that this is a very clean image and we still ended up picking up small bits of stuff in the background. skimage.measure.regionprops() computes the area of each region, in addition to lots of other useful statistics. Conveniently, through the intensity_image keyword argument, we can specify a corresponding intensity image, in this case the fluorescent image. When an intensity image is specified, skimage.measure.regionprops() also computes, among other things, the average intensity for each of the regions. Ultimately, we want the average intensity.
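A minimal sketch of how the labeled image and the intensity image work together, using two small hypothetical arrays. For each label, regionprops reports the pixel count as area and the average of the matching intensity pixels as mean_intensity.

```python
import numpy as np
import skimage.measure

# Hypothetical labeled image with two regions
im_labeled = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 0, 2],
        [0, 0, 0, 2],
    ]
)

# Hypothetical intensity image of the same shape
im_int = np.array(
    [
        [10.0, 20.0, 0.0, 0.0],
        [30.0, 40.0, 0.0, 100.0],
        [0.0, 0.0, 0.0, 300.0],
    ]
)

props = skimage.measure.regionprops(im_labeled, intensity_image=im_int)

for prop in props:
    print(prop.label, prop.area, prop.mean_intensity)
```

Region 1 covers four pixels averaging 25, and region 2 covers two pixels averaging 200.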
[11]:
# Extract region props
im_props = skimage.measure.regionprops(
    im_labeled,
    intensity_image=im_fl_filt,
)
That’s it! That’s pretty much all there is to it! We now need to eliminate regions that are too small to be bacteria. First, let’s get an estimate for how big a bacterium is. We can get that estimate by zooming in on an image and counting pixels. I would estimate the bacteria are about 30 pixels long and 10 pixels wide, which is a bit of an underestimate. So, let’s say the cutoff for being a bacterium is about half that, or 150 pixels. We can use all of the data conveniently stored in im_props to clear up our image.
[12]:
# Make a filtered black and white image
im_bw_filt = im_labeled > 0
# Define cutoff size
cutoff = 150
# Loop through image properties and delete small objects
n_regions = 0
for prop in im_props:
    if prop.area < cutoff:
        im_bw_filt[im_labeled==prop.label] = 0
    else:
        n_regions += 1
# Show number of regions
print('Number of individual regions = ', n_regions)
# Look at result
bokeh.io.show(bootcamp_utils.imshow(im_bw_filt, **kwargs))
Number of individual regions = 7
So, considering only the larger features in the image, we have found the bacteria.
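As an alternative sketch of the same size filter without an explicit loop, the areas can be read off with np.bincount on the labeled image. The array and the cutoff of 4 pixels are hypothetical. Keep in mind that a pixel cutoff depends on resolution: after 2×2 downsampling, a region covers roughly a quarter as many pixels as in the full-size image.

```python
import numpy as np

# Hypothetical labeled image: label 1 has 6 pixels, label 2 has 1, label 3 has 4
im_labeled = np.array(
    [
        [1, 1, 1, 0, 2],
        [1, 1, 1, 0, 0],
        [0, 0, 0, 3, 3],
        [0, 0, 0, 3, 3],
    ]
)
cutoff = 4  # illustrative cutoff in pixels, not the lesson's value

# areas[k] is the number of pixels carrying label k (k = 0 is background)
areas = np.bincount(im_labeled.ravel())

# Labels at or above the cutoff, excluding the background label
keep = np.flatnonzero(areas >= cutoff)
keep = keep[keep != 0]

# True wherever a pixel belongs to a kept region
im_bw_filt = np.isin(im_labeled, keep)

print(areas)
print(im_bw_filt.astype(int))
```

This keeps regions whose area is at least the cutoff, matching the loop above, which deletes regions with area below it.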
Now, there is still the issue of regions that contain two bacteria. Again, we look at our zoomed in region.
[13]:
# Show zoomed in image
bokeh.io.show(bootcamp_utils.imshow(im_bw_filt[slc], **kwargs))
The bacteria in the upper left appear to be two different bacteria that may or may not be sisters. The region to the right is either a bacterium that is dividing or one that has just divided. Depending on your experiment, you may want to treat these as a single bacterium or two. For this lesson, we will try to eliminate cells that are side-by-side and only keep the “lonely” cells.
So, we can test for bacteria that are side-by-side, versus those that are in line with each other, as would be the case for a dividing bacterium. We can use the eccentricity measure of the region. According to the skimage documentation,
Eccentricity of the ellipse that has the same second-moments as the region. The eccentricity is the ratio of the distance between its minor and major axis length. The value is between 0 and 1.
So, we only want objects with large eccentricity, say above 0.85.
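To get a feel for the numbers, recall that an ellipse with major axis a and minor axis b has eccentricity sqrt(1 - (b/a)^2). The sketch below plugs in rough cell dimensions as if they were the ellipse axes. That is an approximation, since the fitted ellipse of a real region will not match its length and width exactly, and the side-by-side dimensions are a hypothetical guess.

```python
import numpy as np


def ellipse_eccentricity(major, minor):
    """Eccentricity of an ellipse from its major and minor axis lengths."""
    return np.sqrt(1.0 - (minor / major) ** 2)


# Single cell, roughly 30 x 10 pixels
ecc_single = ellipse_eccentricity(30, 10)

# Two such cells side by side, roughly 30 x 20 pixels (hypothetical)
ecc_pair = ellipse_eccentricity(30, 20)

print(ecc_single, ecc_pair)
```

Under these assumptions a lone rod-shaped cell lands above 0.85 and a side-by-side pair lands below it, which is why an eccentricity cutoff helps separate them.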
[14]:
# Loop through image properties and delete small objects and round objects
n_regions = 0
for prop in im_props:
    if prop.area < cutoff or prop.eccentricity < 0.85:
        im_bw_filt[im_labeled==prop.label] = 0
    else:
        n_regions += 1
# Show number of regions
print('Number of individual regions = ', n_regions)
# Look at result
bokeh.io.show(bootcamp_utils.imshow(im_bw_filt[slc], **kwargs))
Number of individual regions = 5
Voilà! Now, we can go ahead and compute the mean intensity of all regions we are interested in.
[15]:
# Initialize list of intensities of individual bacteria
mean_intensity = []
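# Loop through regions and compute mean intensity of bacteria.
# Note that the eccentricity cutoff in this cell is 0.8, slightly
# looser than the 0.85 used to build the mask above.
for prop in im_props:
    if prop.area > cutoff and prop.eccentricity > 0.8:
        mean_intensity.append(prop.mean_intensity)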
# Convert list to NumPy array
mean_intensity = np.array(mean_intensity)
# Take a look
mean_intensity
[15]:
array([293.0106383 , 240.75816993, 205.70394737, 441.13106796,
221.34254144, 235.70481928])
The mean intensities differ by about a factor of two from the lowest to the highest.
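The factor of two can be checked directly from the printed values, which are copied here by hand.

```python
import numpy as np

# Values copied from the output above
mean_intensity = np.array(
    [
        293.0106383,
        240.75816993,
        205.70394737,
        441.13106796,
        221.34254144,
        235.70481928,
    ]
)

# Ratio of the brightest to the dimmest cell
ratio = mean_intensity.max() / mean_intensity.min()

print(ratio)
```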
Overlay of fluorescent image
Finally, we can overlay our fluorescent image with the phase contrast image. We will take the fluorescent color to be green, and ranging from zero to one.
[16]:
# Build RGB image by stacking grayscale images
im_rgb = np.dstack(3 * [im_phase_filt / im_phase_filt.max()])
# Put the normalized fluorescence in the green channel
im_rgb[:, :, 1] = im_fl_filt / im_fl_filt.max()
# Show the result
bokeh.io.show(bootcamp_utils.imshow(im_rgb, color_mapper='rgb'))
Beautiful!
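For reference, the overlay construction can be sketched with two tiny hypothetical arrays standing in for the phase and fluorescence images. Stacking the normalized phase image three times gives a gray RGB image, and replacing the green channel adds the fluorescence on top.

```python
import numpy as np

# Hypothetical stand-ins for the phase and fluorescence images
im_phase = np.array([[100, 200], [300, 400]], dtype=float)
im_fl = np.array([[0, 50], [25, 100]], dtype=float)

# Gray background: the same normalized phase image in R, G, and B
im_rgb = np.dstack(3 * [im_phase / im_phase.max()])

# Replace the green channel with the normalized fluorescence
im_rgb[:, :, 1] = im_fl / im_fl.max()

print(im_rgb.shape)
print(im_rgb[0, 0])
```

Dividing each image by its own maximum keeps every channel between zero and one, at the cost of making brightness comparisons between different images unreliable.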
Computing environment
[17]:
%load_ext watermark
%watermark -v -p numpy,skimage,bokeh,holoviews,bootcamp_utils,jupyterlab
Python implementation: CPython
Python version : 3.9.12
IPython version : 8.3.0
numpy : 1.21.5
skimage : 0.19.2
bokeh : 2.4.2
holoviews : 1.14.8
bootcamp_utils: 0.0.7
jupyterlab : 3.3.2