Lesson 44: Introduction to image processing with scikit-image
[1]:
import numpy as np
import pandas as pd
# Our image processing tools
import skimage.filters
import skimage.io
import skimage.measure
import skimage.morphology
import colorcet
import iqplot
import bootcamp_utils
import holoviews as hv
hv.extension('bokeh')
import bokeh.io
import bokeh.layouts
import bokeh.models
import bokeh.palettes
import bokeh.plotting
bokeh.io.output_notebook()
In this tutorial, we will learn some basic techniques for image processing using scikit-image (http://scikit-image.org) with Python.
Image processing tools for Python
There are many image processing tools available for Python. Some of them, such as ITK and OpenCV, are mature image processing packages that have bindings for Python, allowing easy use of their functionality. Others were developed specifically for Python. Some of the many packages are

- scikit-image (general-purpose image processing)
- scipy.ndimage (multidimensional image processing, part of SciPy)
- OpenCV (extensive computer vision package)
- CellProfiler (Broad Institute of MIT and Harvard)
- Insight Segmentation and Registration Toolkit (ITK, used in medical imaging, supported by the NIH)

The first two packages are standard with Anaconda. They provide a set of basic image processing tools, with more sophisticated packages such as ITK and Fiji supplying many more bells and whistles. If in the future you have more demanding image processing requirements, the other packages can prove very useful.
These days, there are lots of machine learning based packages for image segmentation, but few of these are mature packages at the moment. In future editions of the bootcamp, as these techniques and packages mature, we may use them.
We will almost exclusively use scikit-image along with the standard tools from NumPy. The package scipy.ndimage is quite useful, but we will use scikit-image, since it has expanded functionality. A potential annoyance with skimage is that the main package has minimal functionality, and you must import subpackages as needed. For example, to load and view images, you will need to import skimage.io. Importantly, skimage is well-documented, and you can access the documentation at http://scikit-image.org/.

We will explore skimage’s capabilities and some basic image processing techniques through example. In this lesson, we will take a brightfield and a fluorescent image of bacteria and perform segmentation, that is, the identification of each pixel in the image as being bacterial or background.
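As a toy sketch of what segmentation means computationally (made-up numbers, just to illustrate that a segmentation is itself an array, not the workflow we will use on the real images), we can threshold a tiny synthetic image so that every pixel gets a True/False label:

# Toy illustration with made-up numbers: a segmentation is just another array,
# here a Boolean mask marking which pixels are "bright" according to Otsu's threshold.
im_toy = np.array([[1, 1, 9], [1, 8, 9], [1, 1, 1]], dtype=float)
thresh = skimage.filters.threshold_otsu(im_toy)
mask = im_toy > thresh
print(mask)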
Loading and viewing images
We will now load and view the test images we will use for segmentation. We load the image using the skimage.io.imread() function. The image is stored as a NumPy array. Each entry in the array is a pixel value. This is an important point: a digital image is data! It is a set of numbers with spatial positions.
Today, we’ll be looking at some images of Bacillus subtilis, a gram-positive bacterium famous for its ability to enter a form of “suspended animation” known as sporulation when environmental conditions get rough. In these images, all cells have been engineered to express Cyan Fluorescent Protein (CFP) once they enter a particular genetic state known as competence. These cells have been imaged under phase contrast (bsub_100x_phase.tif) and epifluorescence (bsub_100x_cfp.tif) microscopy. These images were acquired by former Caltech graduate student (and 2016 bootcamp TA) Griffin Chure.
Let’s go ahead and load an image.
[2]:
# Load the phase contrast image.
im_phase = skimage.io.imread('data/bsub_100x_phase.tif')
# Take a look
im_phase
[2]:
array([[398, 403, 418, ..., 381, 377, 373],
[410, 400, 398, ..., 385, 372, 395],
[394, 407, 421, ..., 376, 377, 378],
...,
[371, 382, 380, ..., 389, 380, 370],
[362, 368, 356, ..., 397, 383, 382],
[372, 364, 372, ..., 385, 371, 378]], dtype=uint16)
We indeed have a NumPy array of integer values. To properly display images, we also need to specify the interpixel distance, the physical distance corresponding to neighboring pixels in an image. Interpixel distances are calibrated for an optical setup by various means. For this particular setup, the interpixel distance was 62.6 nm.
[3]:
# Store the interpixel distance in units of microns
ip_distance = 0.0626
Now that we have the image loaded and know the interpixel distance, we would like to view it. Really, I should say “plot it,” because an image is data.
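Because the image is an ordinary NumPy array, we can interrogate it like any other data set before plotting. A minimal sketch (the pixel indices below are arbitrary choices for illustration):

# The image is an ordinary NumPy array; shape, data type, and individual
# pixel values are all accessible with standard NumPy syntax.
print(im_phase.shape)      # (rows, columns) in pixels
print(im_phase.dtype)      # 16-bit unsigned integers
print(im_phase[100, 150])  # intensity of an arbitrarily chosen pixel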
Downsampling the image
In almost any scientific application, we would proceed using the loaded image. However, to reduce the file size of this notebook for display on the internet, it is useful (in fact necessary, given GitHub’s file size limits) to downsample the image. If you want to work through the notebook on your machine with full-sized images, set downsample to False in the code cell below.
[4]:
downsample = True

if downsample:
    im_phase = skimage.measure.block_reduce(im_phase, (2, 2), np.mean).astype(
        im_phase.dtype
    )
    ip_distance /= 2
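As a quick, optional sanity check (a sketch, not part of the original workflow), we can confirm that 2×2 block averaging halves the number of pixels along each dimension while the doubled interpixel distance keeps the physical field of view unchanged.

# Half as many pixels per dimension, but the same physical extent,
# because the interpixel distance has doubled.
print(im_phase.shape)
print(ip_distance, "µm between pixel centers")
print(im_phase.shape[0] * ip_distance, "µm of physical image height")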
Viewing images with Bokeh
To view images in Bokeh, we use the p.image() method. It expects a list of 2D arrays. The images in the list are overlaid. In practice, we usually only include a single image in the list.

Importantly, we can set the physical ranges of the x- and y-axes in the image display using the dw and dh kwargs, which respectively specify the width and height of the displayed image in “data units.” For example, if the image is 50 by 100 microns, we can set dw = 100 and dh = 50.

Finally, there is a color_mapper kwarg, which sets how the image is colored. We will discuss this in more detail soon, but for now, we set the color mapper to be grayscale.
With this in mind, let’s show an image!
[5]:
# Get the physical scale of the image in terms of interpixel distance
dw = im_phase.shape[1] * ip_distance
dh = im_phase.shape[0] * ip_distance
# Set up figure, making sure x_range and y_range exactly capture image
p = bokeh.plotting.figure(
    frame_width=int(im_phase.shape[1] / 2),
    frame_height=int(im_phase.shape[0] / 2),
    x_axis_label="µm",
    y_axis_label="µm",
    x_range=[0, dw],
    y_range=[0, dh],
)

# Put in the image glyph; x and y set to origin
p.image(
    image=[im_phase],
    x=0,
    y=0,
    dw=dw,
    dh=dh,
    color_mapper=bokeh.models.LinearColorMapper(bokeh.palettes.gray(256)),
)
bokeh.io.show(p)
Display of images like this, though fairly straightforward, is automated in the bootcamp_utils.imshow() function. This has other bells and whistles, including allowance for showing multichannel images and showing colorbars indicating pixel intensities.
[6]:
p = bootcamp_utils.imshow(
    im_phase,
    interpixel_distance=ip_distance,
    length_units="µm",
    color_mapper=bokeh.models.LinearColorMapper(bokeh.palettes.grey(256)),
    colorbar=True,
    frame_height=300,
)
bokeh.io.show(p)
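The same convenience function works for the fluorescence channel. As a sketch (assuming the CFP image lives at data/bsub_100x_cfp.tif, alongside the phase-contrast file, and was acquired with the same optics), we could display it like so, downsampling it the same way so the adjusted interpixel distance still applies:

# Load the CFP fluorescence image (path assumed to parallel the phase image)
im_cfp = skimage.io.imread('data/bsub_100x_cfp.tif')

# Downsample it the same way as the phase image so ip_distance remains valid
if downsample:
    im_cfp = skimage.measure.block_reduce(im_cfp, (2, 2), np.mean).astype(im_cfp.dtype)

# Display with the same settings as above
p = bootcamp_utils.imshow(
    im_cfp,
    interpixel_distance=ip_distance,
    length_units="µm",
    color_mapper=bokeh.models.LinearColorMapper(bokeh.palettes.gray(256)),
    colorbar=True,
    frame_height=300,
)
bokeh.io.show(p)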
Conveniently, images rendered using Bokeh allow for interactivity with zooming, etc. They also retain the axis ticks and labels, rendering scale bars unnecessary.
Viewing images with HoloViews
The Image element of HoloViews enables easy viewing of images.
[7]:
hv.Image(
    data=im_phase
)
[7]:
The image is displayed with HoloViews’s default colormap (more on that in a moment), and is roughly square by default. The axes are also scaled to be of unit length. We would rather have the axes marked in units of microns so we know the physical distances. We can specify that with the bounds kwarg for hv.Image, which consists of a list of numerical values to use for the [left, bottom, right, top] of the image. We can make that for this image using our value of the interpixel distance.
[8]:
height_pixels, width_pixels = im_phase.shape
bounds = [0, 0, width_pixels*ip_distance, height_pixels*ip_distance]
The representations of the pixels are also not square (they need not be; an image is data!), but aesthetically, we prefer square pixels. To enforce that, we need to set the width and height of the display. We can set the height, and then the width of the figure is computed from the height to give approximately square pixels. We will specify a rather small image, since later on we will compare images side by side and we want room for that.
[9]:
frame_height = 200
frame_width = im_phase.shape[1] * frame_height // im_phase.shape[0]
Finally, we might want to look at the image with a grayscale colormap.
[10]:
hv.Image(
    data=im_phase,
    bounds=bounds,
).opts(
    frame_height=frame_height,
    frame_width=frame_width,
    cmap='gray',
    xlabel='µm',
    ylabel='µm',
)
[10]:
Lookup tables
In the above image representations, we used a gray colormap. Following are a few different colormaps we could use instead. As I discuss momentarily, you will almost always want to use Viridis.
[11]:
# Colormaps to check out, including a couple from colorcet
cmaps = [
    bokeh.palettes.gray(256),
    bokeh.palettes.viridis(256),
    colorcet.fire,
    colorcet.coolwarm,
]

cmap_names = ["gray", "Viridis", "fire", "coolwarm"]

# Build plots
plots = [
    bootcamp_utils.imshow(
        im_phase,
        interpixel_distance=ip_distance,
        length_units="µm",
        color_mapper=bokeh.models.LinearColorMapper(cmap),
        colorbar=True,
        frame_height=200,
        title=cmap_name,
    )
    for cmap_name, cmap in zip(cmap_names, cmaps)
]
bokeh.io.show(bokeh.layouts.gridplot(plots, ncols=2))
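The same lookup tables can be applied to the HoloViews display as well. As a minimal sketch, we can pass a colormap name (here the built-in 'viridis') to the cmap option of .opts(), just as we did with 'gray' above:

# Redisplay the HoloViews image with the Viridis lookup table
hv.Image(
    data=im_phase,
    bounds=bounds,
).opts(
    frame_height=frame_height,
    frame_width=frame_width,
    cmap='viridis',
    xlabel='µm',
    ylabel='µm',
)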