Exercise 6.2: Writing your own segmentation function

As you probably noticed during the previous two lessons, there are often a lot of small operations that are sometimes necessary before you can even extract the useful data from the image! Rather than writing out each step whenever you want to process an image, you should write a boilerplate function that can be used to segment any phase contrast image of bacteria. Your function should execute the following steps.

Correct for “hot” or “bad” pixels in an image.

Correct for uneven illumination.

Perform a thresholding operation.

Remove bacteria or objects near/touching the image border.

Remove objects that are too large (or too small) to be bacteria. Think carefully! For a multipurpose function, would you always want the same area cutoff?

Remove improperly segmented cells.

Return a labeled segmentation mask.

A set of discrete operations like this is called a pipeline. It is usefull to think of image processing, or data processing for that matter, as a pipeline consisting of successive steps. This helps codify how you are extracting conclusions from your data and improves reproducibility.

After putting your pipeline into a function, run the pipeline on both the Bacillus subtilis (Lesson 44) and the E. coli Lesson 45 image sets. Does your function work well in both cases? Note that in both of these image sets, the interpixel distance is 0.0626 µm per pixel.

Solution

Many of our image processing functions will come from scikit-image. The function below calls on functions from these packages directly. This is a wonderful example of the power of modular programming—each operation performs a single task!

[1]:

import numpy as np

import skimage.io
import skimage.segmentation
import skimage.morphology

import holoviews as hv
hv.extension('bokeh')

[2]:

def cell_segmenter(
    im,
    thresh="otsu",
    radius=20.0,
    image_mode="phase",
    area_bounds=(0, 1e7),
    ecc_bounds=(0, 1),
):
    """
    This function segments a given image via thresholding and returns
    a labeled segmentation mask.

    Parameters
    ----------
    im : 2d-array
        Image to be segmented. This may be of either float or integer
        data type.
    thresh : int, float, or 'otsu'
        Value used during thresholding operation. This can either be a value
        (`int` or `float`) or 'otsu'. If 'otsu', the threshold value will be
        determined automatically using Otsu's thresholding method.
    radius : float
        Radius for gaussian blur for background subtraction. Default value
        is 20.
    image_mode : 'phase' or 'fluorescence'
        Mode of microscopy used to capture the image. If 'phase', objects with
        intensity values *lower* than the provided threshold will be selected.
        If `fluorescence`, values *greater* than the provided threshold will be
        selected. Default value is 'phase'.
    area_bounds : tuple of ints.
        Range of areas of acceptable objects. This should be provided in units
        of square pixels.
    ecc_bounds : tuple of floats
        Range of eccentricity values of acceptable objects. These values should
        range between 0.0 and 1.0.

    Returns
    -------
    im_labeled : 2d-array, int
        Labeled segmentation mask.
    """
    # Apply a median filter to remove hot pixels.
    med_selem = skimage.morphology.square(3)
    im_filt = skimage.filters.median(im, footprint=med_selem)

    # Perform gaussian subtraction
    im_sub = bg_subtract(im_filt, radius)

    # Determine the thresholding method.
    if thresh == "otsu":
        thresh = skimage.filters.threshold_otsu(im_sub)

    # Determine the image mode and apply threshold.
    if image_mode == "phase":
        im_thresh = im_sub < thresh
    elif image_mode == "fluorescence":
        im_thresh = im_sub > thresh
    else:
        raise ValueError(
            "image mode not recognized. Must be 'phase'" + " or 'fluorescence'"
        )

    # Label the objects.
    im_label = skimage.measure.label(im_thresh)

    # Apply the area and eccentricity bounds.
    im_filt = area_ecc_filter(im_label, area_bounds, ecc_bounds)

    # Remove objects touching the border.
    im_border = skimage.segmentation.clear_border(im_filt, buffer_size=5)

    # Relabel the image.
    im_border = im_border > 0

    return skimage.measure.label(im_border)


def bg_subtract(im, radius):
    """
    Subtracts a gaussian blurred image from itself smoothing uneven
    illumination.

    Parameters
    ----------
    im : 2d-array
        Image to be subtracted
    radius : int or float
        Radius of gaussian blur

    Returns
    -------
    im_sub : 2d-array, float
        Background subtracted image.
    """
    # Apply the gaussian filter.
    im_filt = skimage.filters.gaussian(im, radius)

    # Ensure the original image is a float
    if np.max(im) > 1.0:
        im = skimage.img_as_float(im)

    return im - im_filt


def area_ecc_filter(im, area_bounds, ecc_bounds):
    """
    Filters objects in an image based on their areas.

    Parameters
    ----------
    im : 2d-array, int
        Labeled segmentation mask to be filtered.
    area_bounds : tuple of ints
        Range of areas in which acceptable objects exist. This should be
        provided in units of square pixels.
    ecc_bounds : tuple of floats
        Range of eccentricities in which acceptable objects exist. This should be
        provided on the range of 0 to 1.0.

    Returns
    -------
    im_relab : 2d-array, int
        The relabeled, filtered image.
    """

    # Extract the region props of the objects.
    props = skimage.measure.regionprops(im)

    # Extract the areas and labels.
    areas = np.array([prop.area for prop in props])
    eccs = np.array([prop.eccentricity for prop in props])
    labels = np.array([prop.label for prop in props])

    # Make an empty image to add the approved cells.
    im_approved = np.zeros_like(im)

    # Threshold the objects based on area and eccentricity
    for i, _ in enumerate(areas):
        if (
            areas[i] > area_bounds[0]
            and areas[i] < area_bounds[1]
            and eccs[i] > ecc_bounds[0]
            and eccs[i] < ecc_bounds[1]
        ):
            im_approved += im == labels[i]

    # Relabel the image.
    return skimage.measure.label(im_approved > 0)

In the previous code block, we defined a few functions. I could have thrown all components into a single function, but keeping modularity is important. The filtering functions will make application of area and eccentricity bounds easier especially if I want to use them again in the future. The important function here is cell_segmenter() which actually calls the two filtering functions we wrote above! Let’s try them out on some test images.

[3]:

# Load a B. subtilis and E. coli test image.
ecoli = skimage.io.imread("data/HG105_images/noLac_phase_0004.tif")
bsub_phase = skimage.io.imread("data/bsub_100x_phase.tif")
bsub_fluo = skimage.io.imread("data/bsub_100x_cfp.tif")

# Using my knowledge of biology, we can draw some bounds.
ip_dist = 0.0626  # in units of µm per pixel.
area_bounds = (1 / ip_dist ** 2, 10.0 / ip_dist ** 2)
ecc_bounds = (0.8, 1.0)  # they are certainly not spheres.

# Pass all images through our function.
ecoli_seg = cell_segmenter(
    ecoli, area_bounds=area_bounds, ecc_bounds=ecc_bounds
)
bsub_phase_seg = cell_segmenter(
    bsub_phase,
    image_mode="phase",
    area_bounds=area_bounds,
    ecc_bounds=ecc_bounds,
)
bsub_fluo_seg = cell_segmenter(
    bsub_fluo,
    image_mode="fluorescence",
    area_bounds=area_bounds,
    ecc_bounds=ecc_bounds,
)

We can make merged images to see how well we segmented all of the objects. Since this would be hard to see on the Bacillus fluorescence image directly, we’ll take a look at the fluorescence segmentation on the corresponding phase image.

[4]:

# Make the two phase images as floats and copy them.
ecoli_float = ecoli / ecoli.max()
bsub_float = bsub_phase / bsub_phase.max()
ecoli_copy = np.copy(ecoli_float)
bsub_copy = np.copy(bsub_float)
bsub_copy2 = np.copy(bsub_float)

# Mark the segmented bacteria on the copied images.
ecoli_copy[ecoli_seg > 0] = 0.8
bsub_copy[bsub_phase_seg > 0] = 0.8
bsub_copy2[bsub_fluo_seg > 0] = 0.8

# Merge them into RGB images.
ecoli_merge = np.dstack((ecoli_copy, ecoli_float, ecoli_float))
bsub_phase_merge = np.dstack((bsub_copy, bsub_float, bsub_float))
bsub_fluo_merge = np.dstack((bsub_copy2, bsub_float, bsub_float))

# Let's plot 'em!
frame_width = 200
frame_height = int(frame_width * ecoli.shape[0] / ecoli.shape[1])
opts = dict(
    frame_height=frame_height, frame_width=frame_width, xlabel="", ylabel=""
)
plots = [
    hv.RGB(im).opts(**opts)
    for im in [ecoli_merge, bsub_phase_merge, bsub_fluo_merge]
]

hv.Layout(plots).cols(1)

[4]:

That’s actually pretty good! It’s obvious that segmenting in fluorescence yields a better mask than in phase. Keep that in mind for your own experiments!

Computing environment

[5]:

%load_ext watermark
%watermark -v -p numpy,skimage,bokeh,holoviews,jupyterlab

Python implementation: CPython
Python version       : 3.9.12
IPython version      : 8.3.0

numpy     : 1.21.5
skimage   : 0.19.2
bokeh     : 2.4.2
holoviews : 1.14.8
jupyterlab: 3.3.2