Lesson 19: Introduction to scripting

This tutorial was generated from a Jupyter notebook. You can download the notebook here.

In [2]:
# We just need modules from the standard library
import datetime
import glob
import os
import re
import shutil
import subprocess

Scripting is a way to automate your tasks. For example, let's say you are a lawyer and you need to redact a bunch of text documents. Two possible ways to do this are:

  1. Open each document on your computer. Search for the text "Jeffrey Lebowski." Wherever you find it, change it to "xxxxxxx." Save the document.
  2. Tell your assistant to open each document on your computer. Search for the text "Jeffrey Lebowski." Wherever he finds it, change it to "xxxxxxx." Save the document.

I suspect that most of you would choose option 2. Scripting is basically doing option 2, except your assistant is your computer.

Python is a good scripting language because of its simple syntax and because its standard library includes modules for parsing text and communicating with the operating system. In this tutorial, we will write some Python scripts to do tasks we might encounter in biology. (We already saw some examples of text processing in previous tutorials.)

Our example scripting problem

As tends to be our philosophy, we will learn some scripting procedures by example. In the example we consider, we will parse a directory of images coming from a Leica SP2 confocal microscope. These microscopes came out about fifteen years ago, and the software is also a bit old. It is very common in research to use older instruments, especially high-end ones like this one, which can cost hundreds of thousands of dollars.

One problem with the old software used for image acquisition on this microscope is that the file names are stored in the following format:

prefix_IMAGENAME_ch00.tif
prefix_SERIESNAME_t00_z000_ch00.tif

Here, prefix is chosen by the user. If we are taking a single image (not a time series or $z$-stack), then IMAGENAME is assigned by the software, since you may have multiple images (or series of images) with the same prefix. After ch are two digits indicating which channel is being used (that is, which absorbance/emission wavelengths are being used).

For series of images, the two digits after the t character indicate the frame number (i.e., time point). The digits after z indicate the z position, and the digits after ch are as in the single image example.

An inherent problem with this naming convention becomes clear when we have more than 100 images in a time series. For example, the 99th, 100th, and 101st images would be

prefix_SERIESNAME_t98_z000_ch00.tif
prefix_SERIESNAME_t99_z000_ch00.tif
prefix_SERIESNAME_t100_z000_ch00.tif

So, if we were to put all of our images in alphabetical order, which is how many image processing software packages read in images, the order would be

prefix_SERIESNAME_t00_z000_ch00.tif
prefix_SERIESNAME_t01_z000_ch00.tif
                 ⋮
prefix_SERIESNAME_t09_z000_ch00.tif
prefix_SERIESNAME_t100_z000_ch00.tif
prefix_SERIESNAME_t101_z000_ch00.tif
                 ⋮

The frames are now out of order! We should pad the frame number in the file name with more zeros.
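
To see the ordering problem concretely, here is a minimal sketch that sorts a handful of made-up file names following the pattern above. The names and the frame numbers chosen are illustrative assumptions, not files from the data set.

```python
# Hypothetical frame numbers, chosen to straddle the two-digit boundary
frames = (9, 10, 99, 100, 101)

# Names as the acquisition software would write them (two-digit minimum)
names = ['prefix_SERIESNAME_t{0:02d}_z000_ch00.tif'.format(i) for i in frames]

# Plain string sorting compares character by character,
# so t100 and t101 land before t10
alphabetical = sorted(names)

# Padding the frame number to a fixed width makes string order
# agree with frame order
padded = sorted(
    'prefix_SERIESNAME_t{0:08d}_z000_ch00.tif'.format(i) for i in frames
)

for name in alphabetical:
    print(name)
for name in padded:
    print(name)
```

The unpadded names sort with t100 ahead of t10 because the character 0 sorts before the underscore that follows t10. With a fixed width, every frame number occupies the same character positions and the problem goes away.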

We have a sample data set that you can download here. This data set contains an actual set of images I obtained from a Leica SP2. For this data set, prefix = stage9, referring to the stage of development of the cells in the images. Our goals in this lesson are to:

  1. Rename the files so that they can be read cleanly in alphabetical order.
  2. Parse the metadata (found in the file stage9.txt) to find the interpixel distance for these images.
  3. Parse the metadata to find the time points for each image.

We'll start with task 1.

Renaming the files

You may remember from our string tutorial that if we want leading zeros, we use a format specification such as 04d, which formats an integer with a total of four digits.

In [3]:
print('This digit has {0:04d} digits.'.format(25))
This digit has 0025 digits.
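
As a small aside, zero padding can be written in more than one way. The sketch below compares three spellings that I believe are equivalent for non-negative integers; the variable names are just for illustration.

```python
n = 25

old_style = '%04d' % n             # printf-style formatting
new_style = '{0:04d}'.format(n)    # str.format(), as used in this lesson
zfilled = str(n).zfill(4)          # pad an existing string with zeros

print(old_style, new_style, zfilled)
```

All three give the same four-character string here. The rest of the lesson sticks with str.format().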

To make sure we don't run into any problems, we'll rename the time points to have eight total digits. Our strategy is to get a list of all files in the directory. We'll then generate a list of names that the files should have. Finally, we'll rename each file, the equivalent of using the mv command on the command line.

Now, before scripting to rename, move, or otherwise process any of your data, always make a backup copy stashed somewhere out of the way in case you make a mistake. Let me reiterate that.

Always have a backup of your data before parsing with scripts.
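
One possible way to make such a backup from Python is shutil.copytree(), which copies a whole directory. The sketch below works on a throwaway temporary directory with a single placeholder file so that it is safe to run anywhere; the directory and file names are assumptions for illustration, and you would substitute your own paths.

```python
import os
import shutil
import tempfile

# Stand-in for a real data directory (hypothetical; use your own path instead)
work_dir = tempfile.mkdtemp()
data_dir = os.path.join(work_dir, 'leica_tiffs')
os.mkdir(data_dir)
with open(os.path.join(data_dir, 'stage9_Series006_t00_z000_ch00.tif'), 'w') as f:
    f.write('placeholder')

# Copy the whole directory tree. By default copytree raises an error if the
# destination already exists, so an earlier backup is not silently overwritten.
backup_dir = os.path.join(work_dir, 'leica_tiffs_backup')
shutil.copytree(data_dir, backup_dir)

print(sorted(os.listdir(backup_dir)))
```

A copy made this way lives on the same disk as the original, so it guards against scripting mistakes but not against hardware failure.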

Getting a list of the directory contents: the os and glob modules

The os module from the standard library has lots of great tools for working with files and talking to the OS. We'll generate a list of files in our Leica directory, demonstrating some of the useful features of the os module.

In [4]:
# Specify Leica directory
leica_dir = '../data/leica_tiffs'

# Get a list of all files in the directory
file_list = os.listdir(leica_dir)

# Look at the list
file_list
Out[4]:
['stage9.txt',
 'stage9_Image003_ch00.tif',
 'stage9_Series006_t00_z000_ch00.tif',
 'stage9_Series006_t01_z000_ch00.tif',
 'stage9_Series006_t02_z000_ch00.tif',
 'stage9_Series006_t03_z000_ch00.tif',
 'stage9_Series006_t04_z000_ch00.tif',
 'stage9_Series006_t05_z000_ch00.tif',
 'stage9_Series006_t06_z000_ch00.tif',
 'stage9_Series006_t07_z000_ch00.tif',
 'stage9_Series006_t08_z000_ch00.tif',
 'stage9_Series006_t09_z000_ch00.tif',
 'stage9_Series006_t100_z000_ch00.tif',
 'stage9_Series006_t101_z000_ch00.tif',
 'stage9_Series006_t102_z000_ch00.tif',
 'stage9_Series006_t103_z000_ch00.tif',
 'stage9_Series006_t104_z000_ch00.tif',
 'stage9_Series006_t105_z000_ch00.tif',
 'stage9_Series006_t106_z000_ch00.tif',
 'stage9_Series006_t107_z000_ch00.tif',
 'stage9_Series006_t108_z000_ch00.tif',
 'stage9_Series006_t109_z000_ch00.tif',
 'stage9_Series006_t10_z000_ch00.tif',
 'stage9_Series006_t110_z000_ch00.tif',
 'stage9_Series006_t111_z000_ch00.tif',
 'stage9_Series006_t112_z000_ch00.tif',
 'stage9_Series006_t114_z000_ch00.tif',
 'stage9_Series006_t115_z000_ch00.tif',
 'stage9_Series006_t116_z000_ch00.tif',
 'stage9_Series006_t117_z000_ch00.tif',
 'stage9_Series006_t118_z000_ch00.tif',
 'stage9_Series006_t119_z000_ch00.tif',
 'stage9_Series006_t11_z000_ch00.tif',
 'stage9_Series006_t120_z000_ch00.tif',
 'stage9_Series006_t121_z000_ch00.tif',
 'stage9_Series006_t122_z000_ch00.tif',
 'stage9_Series006_t123_z000_ch00.tif',
 'stage9_Series006_t124_z000_ch00.tif',
 'stage9_Series006_t12_z000_ch00.tif',
 'stage9_Series006_t13_z000_ch00.tif',
 'stage9_Series006_t14_z000_ch00.tif',
 'stage9_Series006_t15_z000_ch00.tif',
 'stage9_Series006_t16_z000_ch00.tif',
 'stage9_Series006_t17_z000_ch00.tif',
 'stage9_Series006_t18_z000_ch00.tif',
 'stage9_Series006_t19_z000_ch00.tif',
 'stage9_Series006_t20_z000_ch00.tif',
 'stage9_Series006_t21_z000_ch00.tif',
 'stage9_Series006_t22_z000_ch00.tif',
 'stage9_Series006_t23_z000_ch00.tif',
 'stage9_Series006_t24_z000_ch00.tif',
 'stage9_Series006_t25_z000_ch00.tif',
 'stage9_Series006_t26_z000_ch00.tif',
 'stage9_Series006_t27_z000_ch00.tif',
 'stage9_Series006_t28_z000_ch00.tif',
 'stage9_Series006_t29_z000_ch00.tif',
 'stage9_Series006_t30_z000_ch00.tif',
 'stage9_Series006_t31_z000_ch00.tif',
 'stage9_Series006_t32_z000_ch00.tif',
 'stage9_Series006_t34_z000_ch00.tif',
 'stage9_Series006_t35_z000_ch00.tif',
 'stage9_Series006_t36_z000_ch00.tif',
 'stage9_Series006_t37_z000_ch00.tif',
 'stage9_Series006_t38_z000_ch00.tif',
 'stage9_Series006_t39_z000_ch00.tif',
 'stage9_Series006_t40_z000_ch00.tif',
 'stage9_Series006_t41_z000_ch00.tif',
 'stage9_Series006_t42_z000_ch00.tif',
 'stage9_Series006_t43_z000_ch00.tif',
 'stage9_Series006_t44_z000_ch00.tif',
 'stage9_Series006_t45_z000_ch00.tif',
 'stage9_Series006_t46_z000_ch00.tif',
 'stage9_Series006_t47_z000_ch00.tif',
 'stage9_Series006_t48_z000_ch00.tif',
 'stage9_Series006_t49_z000_ch00.tif',
 'stage9_Series006_t50_z000_ch00.tif',
 'stage9_Series006_t51_z000_ch00.tif',
 'stage9_Series006_t52_z000_ch00.tif',
 'stage9_Series006_t53_z000_ch00.tif',
 'stage9_Series006_t54_z000_ch00.tif',
 'stage9_Series006_t55_z000_ch00.tif',
 'stage9_Series006_t56_z000_ch00.tif',
 'stage9_Series006_t57_z000_ch00.tif',
 'stage9_Series006_t58_z000_ch00.tif',
 'stage9_Series006_t59_z000_ch00.tif',
 'stage9_Series006_t60_z000_ch00.tif',
 'stage9_Series006_t61_z000_ch00.tif',
 'stage9_Series006_t62_z000_ch00.tif',
 'stage9_Series006_t63_z000_ch00.tif',
 'stage9_Series006_t64_z000_ch00.tif',
 'stage9_Series006_t65_z000_ch00.tif',
 'stage9_Series006_t66_z000_ch00.tif',
 'stage9_Series006_t67_z000_ch00.tif',
 'stage9_Series006_t68_z000_ch00.tif',
 'stage9_Series006_t69_z000_ch00.tif',
 'stage9_Series006_t70_z000_ch00.tif',
 'stage9_Series006_t71_z000_ch00.tif',
 'stage9_Series006_t72_z000_ch00.tif',
 'stage9_Series006_t73_z000_ch00.tif',
 'stage9_Series006_t74_z000_ch00.tif',
 'stage9_Series006_t75_z000_ch00.tif',
 'stage9_Series006_t76_z000_ch00.tif',
 'stage9_Series006_t77_z000_ch00.tif',
 'stage9_Series006_t78_z000_ch00.tif',
 'stage9_Series006_t79_z000_ch00.tif',
 'stage9_Series006_t80_z000_ch00.tif',
 'stage9_Series006_t81_z000_ch00.tif',
 'stage9_Series006_t82_z000_ch00.tif',
 'stage9_Series006_t83_z000_ch00.tif',
 'stage9_Series006_t84_z000_ch00.tif',
 'stage9_Series006_t85_z000_ch00.tif',
 'stage9_Series006_t86_z000_ch00.tif',
 'stage9_Series006_t87_z000_ch00.tif',
 'stage9_Series006_t88_z000_ch00.tif',
 'stage9_Series006_t89_z000_ch00.tif',
 'stage9_Series006_t90_z000_ch00.tif',
 'stage9_Series006_t91_z000_ch00.tif',
 'stage9_Series006_t92_z000_ch00.tif',
 'stage9_Series006_t93_z000_ch00.tif',
 'stage9_Series006_t94_z000_ch00.tif',
 'stage9_Series006_t95_z000_ch00.tif',
 'stage9_Series006_t96_z000_ch00.tif',
 'stage9_Series006_t97_z000_ch00.tif',
 'stage9_Series006_t98_z000_ch00.tif',
 'stage9_Series006_t99_z000_ch00.tif']

The os.listdir() function lists all of the files in a directory (like the ls command) and stores them as a list. We see we have our metadata file, stage9.txt, a single image, stage9_Image003_ch00.tif, and then a lot of files in a time series (called Series006) that appears to go from frame 00 to frame 124. It is these time series files that we want to rename.

While we have a list of the file names, we would like a list of the full paths of the files. We can use the os.path.join() function to conveniently do that. It joins the strings of the paths, inserting the path separator (/ on macOS and Linux) where appropriate.

In [5]:
# Get list of all files (full path) we want to rename
rename_list = []
for fname in file_list:
    if '_t' in fname:
        rename_list.append(os.path.join(leica_dir, fname))
        
# Let's look at the list
rename_list
Out[5]:
['../data/leica_tiffs/stage9_Series006_t00_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t01_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t02_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t03_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t04_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t05_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t06_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t07_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t08_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t09_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t100_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t101_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t102_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t103_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t104_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t105_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t106_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t107_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t108_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t109_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t10_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t110_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t111_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t112_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t114_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t115_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t116_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t117_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t118_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t119_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t11_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t120_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t121_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t122_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t123_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t124_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t12_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t13_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t14_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t15_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t16_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t17_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t18_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t19_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t20_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t21_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t22_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t23_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t24_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t25_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t26_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t27_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t28_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t29_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t30_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t31_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t32_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t34_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t35_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t36_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t37_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t38_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t39_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t40_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t41_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t42_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t43_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t44_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t45_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t46_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t47_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t48_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t49_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t50_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t51_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t52_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t53_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t54_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t55_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t56_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t57_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t58_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t59_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t60_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t61_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t62_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t63_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t64_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t65_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t66_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t67_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t68_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t69_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t70_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t71_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t72_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t73_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t74_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t75_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t76_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t77_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t78_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t79_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t80_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t81_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t82_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t83_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t84_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t85_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t86_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t87_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t88_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t89_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t90_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t91_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t92_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t93_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t94_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t95_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t96_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t97_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t98_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t99_z000_ch00.tif']

Now, we have a list of only the files we want to rename. We could have done this more concisely using the glob module, which allows generating lists of files with wild card characters. (The name glob has an archaic history.)

In [6]:
# String to look for
search_str = os.path.join(leica_dir, '*_t*.tif')

# Get all files that match search_str
rename_list = glob.glob(search_str)

rename_list
Out[6]:
['../data/leica_tiffs/stage9_Series006_t00_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t01_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t02_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t03_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t04_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t05_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t06_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t07_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t08_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t09_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t100_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t101_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t102_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t103_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t104_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t105_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t106_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t107_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t108_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t109_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t10_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t110_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t111_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t112_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t114_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t115_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t116_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t117_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t118_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t119_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t11_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t120_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t121_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t122_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t123_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t124_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t12_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t13_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t14_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t15_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t16_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t17_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t18_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t19_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t20_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t21_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t22_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t23_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t24_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t25_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t26_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t27_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t28_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t29_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t30_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t31_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t32_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t34_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t35_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t36_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t37_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t38_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t39_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t40_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t41_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t42_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t43_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t44_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t45_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t46_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t47_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t48_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t49_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t50_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t51_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t52_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t53_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t54_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t55_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t56_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t57_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t58_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t59_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t60_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t61_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t62_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t63_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t64_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t65_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t66_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t67_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t68_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t69_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t70_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t71_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t72_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t73_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t74_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t75_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t76_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t77_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t78_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t79_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t80_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t81_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t82_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t83_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t84_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t85_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t86_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t87_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t88_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t89_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t90_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t91_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t92_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t93_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t94_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t95_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t96_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t97_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t98_z000_ch00.tif',
 '../data/leica_tiffs/stage9_Series006_t99_z000_ch00.tif']
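
To get a feel for how the wild card pattern selects files, here is a minimal sketch using the fnmatch module, which applies the same shell-style matching rules to plain strings without touching the file system. The short list of names is copied from the directory listing above. As far as I know, glob.glob() does not promise any particular ordering of its results, so the sketch sorts the matches explicitly.

```python
import fnmatch

# A few names taken from the directory listing above
names = [
    'stage9.txt',
    'stage9_Image003_ch00.tif',
    'stage9_Series006_t10_z000_ch00.tif',
    'stage9_Series006_t100_z000_ch00.tif',
]

# '*' matches any run of characters, so a name must contain '_t'
# somewhere before a '.tif' ending to match
pattern = '*_t*.tif'
matches = sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))

for name in matches:
    print(name)
```

Only the two time series files match. The metadata file lacks the .tif ending, and the single image has no _t in its name.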

Renaming files using the shutil module

We will now rename the files. There are several ways to do this using modules that are part of the standard library. Here, we'll show how to do it using the shutil module. This module contains several functions for file management.

In [7]:
# Pattern to search for (the time stamp); the raw string keeps \d intact
regex = re.compile(r'_t\d+_')

# Rename the files to have 8 total digits in the time field
for fname in rename_list:
    # Pull out _t00_ string
    time_str = regex.search(fname).group()

    # Make a new, 8-digit string
    new_time_str = '_t{0:08d}_'.format(int(time_str[2:-1]))
    
    # Make a new file name; could do: new_fname = regex.sub(new_time_str, fname)
    new_fname = fname.replace(time_str, new_time_str)
    
    # Make a call to shutil.move to rename files
    shutil.move(fname, new_fname)

You can look in your directory now to see all of your beautifully renamed files!
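
Before renaming real files, it can be reassuring to do a dry run that only computes and prints the new names. The sketch below is one way to do that; the helper name pad_time_field and its width parameter are my own inventions for illustration, and it uses re.sub with a replacement function rather than the search-and-replace steps above.

```python
import re

# Same pattern as above, with the digits captured in a group
TIME_FIELD = re.compile(r'_t(\d+)_')


def pad_time_field(fname, width=8):
    """Return fname with the frame number in _tNN_ zero-padded to width digits."""
    def repl(match):
        return '_t{0:0{1}d}_'.format(int(match.group(1)), width)
    # count=1 so that only the first _tNN_ field is rewritten
    return TIME_FIELD.sub(repl, fname, count=1)


# Dry run: show the planned renames without touching any files
for old in ['stage9_Series006_t09_z000_ch00.tif',
            'stage9_Series006_t100_z000_ch00.tif']:
    print(old, '->', pad_time_field(old))
```

Because the function converts the digits to an integer before padding, running it on an already padded name returns that name unchanged, so an accidental second pass should be harmless.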

The subprocess module

The subprocess module of the standard library is useful for running commands on the command line. This is often used in bioinformatics pipelines, for example, when you need to run various programs from the command line to process data sets. We don't really need it for the bootcamp, but I bring it up here because it may come in handy for you down the road. If we wanted to do the same operations as above, except with the subprocess module, we would replace the shutil.move(fname, new_fname) with

subprocess.call(['mv', fname, new_fname])

The subprocess.call() function allows you to execute command line commands as subprocesses. It waits for the command to finish and returns its exit code; if you want to launch several commands without waiting for each one, subprocess.Popen is the lower-level tool for that. The first argument is a list of the strings you would enter in the command line to do your operation. For example, if fname = 'file1' and new_fname = 'file2', the above is the same as entering

mv file1 file2

on the command line.
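
When a script depends on an external command, it is worth checking whether the command succeeded. The following is a minimal sketch using subprocess.run(), which I am assuming is available (it was added in Python 3.5, so it is newer than subprocess.call()). To keep the example independent of any particular shell tool, the command it runs is just the Python interpreter printing a message.

```python
import subprocess
import sys

# Run a tiny command; sys.executable is used only so the example does not
# depend on mv, ls, or any other tool being installed
result = subprocess.run(
    [sys.executable, '-c', 'print("hello from a subprocess")'],
    stdout=subprocess.PIPE,
    universal_newlines=True,   # decode the captured output as text
)

# A return code of 0 conventionally means the command succeeded
print(result.returncode)
print(result.stdout.strip())
```

With subprocess.call(), the return value is the exit code itself, so the same check can be written by comparing that value to zero.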

As an example of how the subprocess.call() function works, we'll list the contents of the Leica directory and redirect the output to a file leica_dir_contents.txt.

In [8]:
with open('leica_dir_contents.txt', 'w') as f:
    subprocess.call(['ls', '-l', leica_dir], stdout=f)

Looking at leica_dir_contents.txt, we see that it did what we expected.

In [9]:
!head leica_dir_contents.txt
total 29640
-rw-r--r--+ 1 Justin  staff    13457 Sep 10 13:45 stage9.txt
-rw-r--r--+ 1 Justin  staff  1051510 Sep 10 13:45 stage9_Image003_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000000_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000001_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000002_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000003_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000004_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000005_z000_ch00.tif
-rw-r--r--+ 1 Justin  staff   113062 Sep 10 13:45 stage9_Series006_t00000006_z000_ch00.tif

Parsing the metadata

Let's look at the metadata file that came with our image series.

In [10]:
!head ../data/leica_tiffs/stage9.txt









You should look at this with whatever text editor you like so you can see all of the contents. It is a text file with lots of information about the images that were acquired. Specifically, we want the information about Series006. So, we want to scan the file until we get to the line that says

Series Name:    Series006

The entries immediately after that will give the information about this series. First, let's just read in all of the lines of the file.

In [11]:
with open(os.path.join(leica_dir, 'stage9.txt'), 'r') as f:
    lines = f.readlines()
---------------------------------------------------------------------------
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-11-c70030b86998> in <module>()
      1 with open(os.path.join(leica_dir, 'stage9.txt'), 'r') as f:
----> 2     lines = f.readlines()

/Users/Justin/anaconda/lib/python3.4/codecs.py in decode(self, input, final)
    317         # decode input (taking the buffer into account)
    318         data = self.buffer + input
--> 319         (result, consumed) = self._buffer_decode(data, self.errors, final)
    320         # keep undecoded input until the next call
    321         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 3378: invalid start byte

Yikes! What is going on here? It turns out that these Leica files are not UTF-8 encoded, which means they contain bytes that are not valid UTF-8, such as the 0xb5 reported in the error message. The encoding is ISO-8859-15, a pre-UTF-8 encoding. Very annoying. We just have to specify the encoding when we open the file to be able to read it.
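
To illustrate the failure on a small scale, here is a sketch that decodes a short, made-up byte string containing the byte 0xb5. The bytes are invented for this example and are not taken from stage9.txt. In ISO-8859-15, 0xb5 is the micro sign, while in UTF-8 it cannot start a character.

```python
# Made-up bytes containing 0xb5 (not copied from the metadata file)
raw = b'Voxel size: 0.49 \xb5m'

# Decoding as UTF-8 fails on the 0xb5 byte
try:
    raw.decode('utf-8')
    utf8_failed = False
except UnicodeDecodeError as err:
    utf8_failed = True
    print('UTF-8 failed:', err.reason)

# Decoding as ISO-8859-15 maps 0xb5 to the micro sign
text = raw.decode('ISO-8859-15')
print(text)
```

Passing the same encoding name to open() applies this decoding to the whole file.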

In [12]:
with open(os.path.join(leica_dir, 'stage9.txt'), 'r', 
          encoding='ISO-8859-15') as f:
    lines = f.readlines()

Ok, that worked. Let's take a look at the file lines we pulled out.

In [13]:
lines
Out[13]:
['Leica Microsystems Heidelberg GmbH\x00\n',
 'This file is intended for read-only purposes changes here will not affect the images.\x00\n',
 'Date:\t Thursday, May 30, 2013\x00\n',
 'Time:\t 17:38\x00\n',
 '\x00\n',
 'File Version:\t 26000000\x00\n',
 '\x00\n',
 'EXPERIMENT INFORMATION\x00\n',
 'Number of Images: 2\x00\n',
 "Type:          \tSeries with 'tif'-files \x00\n",
 '\x00\n',
 'DIMENSION DESCRIPTION #0\x00\n',
 'Pixel Size in Byte:  \t1\x00\n',
 'Resolution in Bit:   \t8\x00\n',
 'Max Value:           \t      255.0000000000\x00\n',
 'Min Value:           \t0.000000e+000\x00\n',
 'Label:               \tI\x00\n',
 'Number of Dimensions:\t3\x00\n',
 'Dimension_0:    \t120\x00\n',
 'Logical Size:    \t512\x00\n',
 'Physical Length: \t2.501221e-004 m\x00\n',
 'Physical Origin: \t0.000000e+000 m\x00\n',
 'Dimension_1:    \t121\x00\n',
 'Logical Size:    \t512\x00\n',
 'Physical Length: \t2.501221e-004 m\x00\n',
 'Physical Origin: \t0.000000e+000 m\x00\n',
 'Dimension_2:    \t6815843\x00\n',
 'Logical Size:    \t1\x00\n',
 'Physical Length: \t0.000000e+000\x00\n',
 'Physical Origin: \t0.000000e+000\x00\n',
 'Series Name:\tImage003\x00\n',
 'Description:\t\x00\n',
 '\x00\n',
 'HARDWARE PARAMETER #0\x00\n',
 'AOBS (0)\t\t100.000000\x00\n',
 'AOBS (1)\t\t100.000000\x00\n',
 'AOBS (2)\t\t0.000000\x00\n',
 'AOBS (3)\t\t0.000000\x00\n',
 'AOBS (4)\t\t0.000000\x00\n',
 'AOBS (5)\t\t0.000000\x00\n',
 'AOBS (6)\t\t0.000000\x00\n',
 'AOBS (7)\t\t0.000000\x00\n',
 'AOBS (0)\t\t0.000000\x00\n',
 'AOBS (1)\t\t0.000000\x00\n',
 'AOBS (2)\t\t0.000000\x00\n',
 'AOBS (3)\t\t0.000000\x00\n',
 'AOBS (4)\t\t0.000000\x00\n',
 'AOBS (5)\t\t0.000000\x00\n',
 'AOBS (6)\t\t0.000000\x00\n',
 'AOBS (7)\t\t0.000000\x00\n',
 'AOTF (0)\t\t0.000000\x00\n',
 'AOTF (1)\t\t0.000000\x00\n',
 'AOTF (0)\t\t0.000000\x00\n',
 'AOTF (1)\t\t0.000000\x00\n',
 'AOTF (458)\t\t0.000000\x00\n',
 'AOTF (476)\t\t0.000000\x00\n',
 'AOTF (488)\t\t42.173382\x00\n',
 'AOTF (496)\t\t0.000000\x00\n',
 'AOTF (514)\t\t0.000000\x00\n',
 'AOTF (543)\t\t0.000000\x00\n',
 'AOTF (594)\t\t0.000000\x00\n',
 'AOTF (633)\t\t0.000000\x00\n',
 'AOTF (458)\t\t0.000000\x00\n',
 'AOTF (476)\t\t0.000000\x00\n',
 'AOTF (488)\t\t0.000000\x00\n',
 'AOTF (496)\t\t0.000000\x00\n',
 'AOTF (514)\t\t0.000000\x00\n',
 'AOTF (543)\t\t0.000000\x00\n',
 'AOTF (594)\t\t0.000000\x00\n',
 'AOTF (633)\t\t0.000000\x00\n',
 'PMT 1\tInactive\tInactive\x00\n',
 'PMT 2\tInactive\tInactive\x00\n',
 'PMT 3\tInactive\tInactive\x00\n',
 'PMT 4\tInactive\tInactive\x00\n',
 'PMT NDD1\tInactive\tInactive\x00\n',
 'PMT NDD2\tInactive\tInactive\x00\n',
 'PMT NDD3\tActive\tActive\x00\n',
 'PMT NDD3 (Offs.)\t\t0.000000\x00\n',
 'PMT NDD3 (HV)\t\t250.000000\x00\n',
 'Beam Expander\tBeam Exp 6\tBeam Exp 6\x00\n',
 'External Detection FW\tMirror \tMirror \x00\n',
 'External Detection FW\tMirror \tMirror \x00\n',
 'Polarization FW\t--- \t--- \x00\n',
 'Polarization FW\tEmpty 3 \tEmpty 3 \x00\n',
 'Hardware Type No.\t\t2.000000\x00\n',
 'Laser wavelength\t\t780\x00\n',
 'Scan Field Rotation\t\t-0.038943\x00\n',
 'Rotation Direction\t\t1\x00\n',
 'X Scan Actuator\tActive\tActive\x00\n',
 'X Scan Actuator (Gain)\t\t2.998536\x00\n',
 'X Scan Actuator (Offs.)\t\t0.000000\x00\n',
 'Y Scan Actuator\tActive\tActive\x00\n',
 'Y Scan Actuator (Gain)\t\t2.998536\x00\n',
 'Y Scan Actuator (Offs.)\t\t0.000000\x00\n',
 'Z Scan Actuator\tInactive\tInactive\x00\n',
 'Z Scan Actuator (POS)\t\t0.000000\x00\n',
 'Scan Speed\t\t1000.000000\x00\n',
 'Phase\t\t64.778646\x00\n',
 'Y-Phase\t\t0.122100\x00\n',
 'SP Mirror 1 (left)\t\t400.000000\x00\n',
 'SP Mirror 1 (right)\t\t450.000000\x00\n',
 'SP Mirror 1 (stain)\tNone\tNone\x00\n',
 'SP Mirror 2 (left)\t\t500.000000\x00\n',
 'SP Mirror 2 (right)\t\t535.000000\x00\n',
 'SP Mirror 2 (stain)\tNone\tNone\x00\n',
 'SP Mirror 3 (left)\t\t555.000000\x00\n',
 'SP Mirror 3 (right)\t\t620.000000\x00\n',
 'SP Mirror 3 (stain)\tNone\tNone\x00\n',
 'SP Mirror 4 (left)\t\t650.000000\x00\n',
 'SP Mirror 4 (right)\t\t750.000000\x00\n',
 'SP Mirror 4 (stain)\tNone\tNone\x00\n',
 'Objective\tHC PL APO CS  20.0x0.70 UV\tHC PL APO CS  20.0x0.70 UV\x00\n',
 'Order number (Obj.)\t\t506513\x00\n',
 'Numerical aperture (Obj.)\t\t0.700000\x00\n',
 '\x00\n',
 'SCANNER INFORMATION #0\x00\n',
 'RoiScan\t\t0\x00\n',
 'IsSequential\t\t0\x00\n',
 'ChaserUVShutter\t\t0\x00\n',
 'ChaserVisibleShutter\t\t0\x00\n',
 'MPShutter\t\t0\x00\n',
 'UVShutter\t\t0\x00\n',
 'VisibleShutter\t\t1\x00\n',
 'ScanMode\txyzt\tInactive\x00\n',
 'Pinhole [m]\t\t0.000055\x00\n',
 'Pinhole [airy]\t\t0.749278\x00\n',
 'Size-Width\t[µm]\t\t250.122070\x00\n',
 'Size-Height\t[µm]\t\t250.122070\x00\n',
 'Size-Depth\t\t0.000000\x00\n',
 'StepSize\t[µm]\t\t0.040703\x00\n',
 'Voxel-Width\t[µm]\t\t0.488520\x00\n',
 'Voxel-Height\t[µm]\t\t0.488520\x00\n',
 'Voxel-Depth\t\t0.000000\x00\n',
 'Zoom\t\t2.998536\x00\n',
 'Scan-Direction\t\t1\x00\n',
 'Y-Scan-Direction\t\t1\x00\n',
 'SequentialMode\t\t0\x00\n',
 'Frame-Accumulation\t\t1\x00\n',
 'Frame-Average\t\t1\x00\n',
 'Line-Average\t\t1\x00\n',
 'Resolution\t\t8\x00\n',
 'Channels\t\t1\x00\n',
 'Delay[ms]\t[ms]\t\t657\x00\n',
 'Format-Width\t\t512\x00\n',
 'Format-Height\t\t512\x00\n',
 'Iterations\t\t125\x00\n',
 'Sections\t\t1\x00\n',
 '\x00\n',
 'TIME INFORMATION #0\x00\n',
 'Stamped Dimension:\t2\x00\n',
 'Stamp_0:    \t2013:05:30,16:45:49:140\x00\n',
 '\x00\n',
 'LUT DESCRIPTION #0\x00\n',
 'LUT_0\x00\n',
 'Name:                   \tGray\x00\n',
 'Inverted (1=yes / 0=no):\t0\x00\n',
 '\x00\n',
 'SEQUENTIAL INFORMATION #0\x00\n',
 'Sequence Count: \t0\x00\n',
 '\x00\n',
 'SERIES INFORMATION #0\x00\n',
 'Number of Series:\t 2\x00\n',
 '\x00\n',
 'IMAGES INFORMATION #0\x00\n',
 'Number of Images: \t1\x00\n',
 'Image Width:      \t512\x00\n',
 'Iamge Length:     \t512\x00\n',
 'Bits per Sample:  \t8\x00\n',
 'Samples per Pixel:\t1\x00\n',
 '\x00\n',
 '*************************************** NEXT IMAGE *********************************\x00\n',
 'DIMENSION DESCRIPTION #1\x00\n',
 'Pixel Size in Byte:  \t1\x00\n',
 'Resolution in Bit:   \t8\x00\n',
 'Max Value:           \t      255.0000000000\x00\n',
 'Min Value:           \t0.000000e+000\x00\n',
 'Label:               \tI\x00\n',
 'Number of Dimensions:\t5\x00\n',
 'Dimension_0:    \t120\x00\n',
 'Logical Size:    \t256\x00\n',
 'Physical Length: \t7.940383e-005 m\x00\n',
 'Physical Origin: \t0.000000e+000 m\x00\n',
 'Dimension_1:    \t121\x00\n',
 'Logical Size:    \t256\x00\n',
 'Physical Length: \t7.940383e-005 m\x00\n',
 'Physical Origin: \t0.000000e+000 m\x00\n',
 'Dimension_2:    \t6815843\x00\n',
 'Logical Size:    \t1\x00\n',
 'Physical Length: \t0.000000e+000\x00\n',
 'Physical Origin: \t0.000000e+000\x00\n',
 'Dimension_3:    \t122\x00\n',
 'Logical Size:    \t1\x00\n',
 'Physical Length: \t0.000000e+000\x00\n',
 'Physical Origin: \t0.000000e+000\x00\n',
 'Dimension_4:    \t116\x00\n',
 'Logical Size:    \t125\x00\n',
 'Physical Length: \t50.00 s\x00\n',
 'Physical Origin: \t0.000000e+000 s\x00\n',
 'Series Name:\tSeries006\x00\n',
 'Description:\t\x00\n',
 '\x00\n',
 'HARDWARE PARAMETER #1\x00\n',
 'AOBS (0)\t\t100.000000\x00\n',
 'AOBS (1)\t\t100.000000\x00\n',
 'AOBS (2)\t\t0.000000\x00\n',
 'AOBS (3)\t\t0.000000\x00\n',
 'AOBS (4)\t\t0.000000\x00\n',
 'AOBS (5)\t\t0.000000\x00\n',
 'AOBS (6)\t\t0.000000\x00\n',
 'AOBS (7)\t\t0.000000\x00\n',
 'AOBS (0)\t\t0.000000\x00\n',
 'AOBS (1)\t\t0.000000\x00\n',
 'AOBS (2)\t\t0.000000\x00\n',
 'AOBS (3)\t\t0.000000\x00\n',
 'AOBS (4)\t\t0.000000\x00\n',
 'AOBS (5)\t\t0.000000\x00\n',
 'AOBS (6)\t\t0.000000\x00\n',
 'AOBS (7)\t\t0.000000\x00\n',
 'AOTF (0)\t\t0.000000\x00\n',
 'AOTF (1)\t\t0.000000\x00\n',
 'AOTF (0)\t\t0.000000\x00\n',
 'AOTF (1)\t\t0.000000\x00\n',
 'AOTF (458)\t\t0.000000\x00\n',
 'AOTF (476)\t\t0.000000\x00\n',
 'AOTF (488)\t\t0.000000\x00\n',
 'AOTF (496)\t\t0.000000\x00\n',
 'AOTF (514)\t\t0.000000\x00\n',
 'AOTF (543)\t\t0.000000\x00\n',
 'AOTF (594)\t\t33.724054\x00\n',
 'AOTF (633)\t\t0.000000\x00\n',
 'AOTF (458)\t\t0.000000\x00\n',
 'AOTF (476)\t\t0.000000\x00\n',
 'AOTF (488)\t\t0.000000\x00\n',
 'AOTF (496)\t\t0.000000\x00\n',
 'AOTF (514)\t\t0.000000\x00\n',
 'AOTF (543)\t\t0.000000\x00\n',
 'AOTF (594)\t\t0.000000\x00\n',
 'AOTF (633)\t\t0.000000\x00\n',
 'PMT 1\tInactive\tInactive\x00\n',
 'PMT 2\tInactive\tInactive\x00\n',
 'PMT 3\tActive\tActive\x00\n',
 'PMT 3 (Offs.)\t\t0.000000\x00\n',
 'PMT 3 (HV)\t\t500.000000\x00\n',
 'PMT 4\tInactive\tInactive\x00\n',
 'PMT NDD1\tInactive\tInactive\x00\n',
 'PMT NDD2\tInactive\tInactive\x00\n',
 'PMT NDD3\tInactive\tInactive\x00\n',
 'Beam Expander\tBeam Exp 6\tBeam Exp 6\x00\n',
 'External Detection FW\tMirror \tMirror \x00\n',
 'External Detection FW\tMirror \tMirror \x00\n',
 'Polarization FW\tEmpty1 \tEmpty1 \x00\n',
 'Polarization FW\tEmpty1 \tEmpty1 \x00\n',
 'Hardware Type No.\t\t2.000000\x00\n',
 'Laser wavelength\t\t780\x00\n',
 'Scan Field Rotation\t\t-0.038943\x00\n',
 'Rotation Direction\t\t1\x00\n',
 'X Scan Actuator\tActive\tActive\x00\n',
 'X Scan Actuator (Gain)\t\t2.998536\x00\n',
 'X Scan Actuator (Offs.)\t\t0.000000\x00\n',
 'Y Scan Actuator\tActive\tActive\x00\n',
 'Y Scan Actuator (Gain)\t\t2.998536\x00\n',
 'Y Scan Actuator (Offs.)\t\t0.000000\x00\n',
 'Z Scan Actuator\tInactive\tInactive\x00\n',
 'Z Scan Actuator (POS)\t\t0.000000\x00\n',
 'Scan Speed\t\t1000.000000\x00\n',
 'Phase\t\t64.778646\x00\n',
 'Y-Phase\t\t0.122100\x00\n',
 'SP Mirror 1 (left)\t\t350.000000\x00\n',
 'SP Mirror 1 (right)\t\t400.000000\x00\n',
 'SP Mirror 1 (stain)\tNone\tNone\x00\n',
 'SP Mirror 2 (left)\t\t500.000000\x00\n',
 'SP Mirror 2 (right)\t\t535.000000\x00\n',
 'SP Mirror 2 (stain)\tNone\tNone\x00\n',
 'SP Mirror 3 (left)\t\t605.000000\x00\n',
 'SP Mirror 3 (right)\t\t700.000000\x00\n',
 'SP Mirror 3 (stain)\tTEXAS RED\tTEXAS RED\x00\n',
 'SP Mirror 4 (left)\t\t800.000000\x00\n',
 'SP Mirror 4 (right)\t\t850.000000\x00\n',
 'SP Mirror 4 (stain)\tNone\tNone\x00\n',
 'Objective\tHCX PL APO CS  63.0x1.20 IMM BD\tHCX PL APO CS  63.0x1.20 IMM BD\x00\n',
 'Order number (Obj.)\t\t506212\x00\n',
 'Numerical aperture (Obj.)\t\t1.200000\x00\n',
 '\x00\n',
 'SCANNER INFORMATION #1\x00\n',
 'RoiScan\t\t0\x00\n',
 'IsSequential\t\t0\x00\n',
 'ChaserUVShutter\t\t0\x00\n',
 'ChaserVisibleShutter\t\t0\x00\n',
 'MPShutter\t\t0\x00\n',
 'UVShutter\t\t0\x00\n',
 'VisibleShutter\t\t1\x00\n',
 'ScanMode\txyzt\tInactive\x00\n',
 'Pinhole [m]\t\t0.000134\x00\n',
 'Pinhole [airy]\t\t0.998881\x00\n',
 'Size-Width\t[µm]\t\t79.403832\x00\n',
 'Size-Height\t[µm]\t\t79.403832\x00\n',
 'Size-Depth\t\t0.000000\x00\n',
 'StepSize\t[µm]\t\t0.040703\x00\n',
 'Voxel-Width\t[µm]\t\t0.310171\x00\n',
 'Voxel-Height\t[µm]\t\t0.310171\x00\n',
 'Voxel-Depth\t\t0.000000\x00\n',
 'Zoom\t\t2.998536\x00\n',
 'Scan-Direction\t\t1\x00\n',
 'Y-Scan-Direction\t\t1\x00\n',
 'SequentialMode\t\t0\x00\n',
 'Frame-Accumulation\t\t1\x00\n',
 'Frame-Average\t\t1\x00\n',
 'Line-Average\t\t1\x00\n',
 'Resolution\t\t8\x00\n',
 'Channels\t\t1\x00\n',
 'Delay[ms]\t[ms]\t\t401\x00\n',
 'Format-Width\t\t256\x00\n',
 'Format-Height\t\t256\x00\n',
 'Iterations\t\t125\x00\n',
 'Sections\t\t1\x00\n',
 '\x00\n',
 'TIME INFORMATION #1\x00\n',
 'Stamped Dimension:\t2\x00\n',
 'Stamp_0:    \t2013:05:30,16:48:30:515\x00\n',
 'Stamp_1:    \t2013:05:30,16:48:30:921\x00\n',
 'Stamp_2:    \t2013:05:30,16:48:31:312\x00\n',
 'Stamp_3:    \t2013:05:30,16:48:31:718\x00\n',
 'Stamp_4:    \t2013:05:30,16:48:32:109\x00\n',
 'Stamp_5:    \t2013:05:30,16:48:32:515\x00\n',
 'Stamp_6:    \t2013:05:30,16:48:32:921\x00\n',
 'Stamp_7:    \t2013:05:30,16:48:33:312\x00\n',
 'Stamp_8:    \t2013:05:30,16:48:33:718\x00\n',
 'Stamp_9:    \t2013:05:30,16:48:34:125\x00\n',
 'Stamp_10:    \t2013:05:30,16:48:34:515\x00\n',
 'Stamp_11:    \t2013:05:30,16:48:34:921\x00\n',
 'Stamp_12:    \t2013:05:30,16:48:35:328\x00\n',
 'Stamp_13:    \t2013:05:30,16:48:35:718\x00\n',
 'Stamp_14:    \t2013:05:30,16:48:36:125\x00\n',
 'Stamp_15:    \t2013:05:30,16:48:36:531\x00\n',
 'Stamp_16:    \t2013:05:30,16:48:36:921\x00\n',
 'Stamp_17:    \t2013:05:30,16:48:37:328\x00\n',
 'Stamp_18:    \t2013:05:30,16:48:37:734\x00\n',
 'Stamp_19:    \t2013:05:30,16:48:38:125\x00\n',
 'Stamp_20:    \t2013:05:30,16:48:38:531\x00\n',
 'Stamp_21:    \t2013:05:30,16:48:38:937\x00\n',
 'Stamp_22:    \t2013:05:30,16:48:39:328\x00\n',
 'Stamp_23:    \t2013:05:30,16:48:39:734\x00\n',
 'Stamp_24:    \t2013:05:30,16:48:40:125\x00\n',
 'Stamp_25:    \t2013:05:30,16:48:40:531\x00\n',
 'Stamp_26:    \t2013:05:30,16:48:40:937\x00\n',
 'Stamp_27:    \t2013:05:30,16:48:41:328\x00\n',
 'Stamp_28:    \t2013:05:30,16:48:41:750\x00\n',
 'Stamp_29:    \t2013:05:30,16:48:42:140\x00\n',
 'Stamp_30:    \t2013:05:30,16:48:42:531\x00\n',
 'Stamp_31:    \t2013:05:30,16:48:42:937\x00\n',
 'Stamp_32:    \t2013:05:30,16:48:43:343\x00\n',
 'Stamp_33:    \t2013:05:30,16:48:43:734\x00\n',
 'Stamp_34:    \t2013:05:30,16:48:44:140\x00\n',
 'Stamp_35:    \t2013:05:30,16:48:44:546\x00\n',
 'Stamp_36:    \t2013:05:30,16:48:44:937\x00\n',
 'Stamp_37:    \t2013:05:30,16:48:45:343\x00\n',
 'Stamp_38:    \t2013:05:30,16:48:45:750\x00\n',
 'Stamp_39:    \t2013:05:30,16:48:46:140\x00\n',
 'Stamp_40:    \t2013:05:30,16:48:46:546\x00\n',
 'Stamp_41:    \t2013:05:30,16:48:46:953\x00\n',
 'Stamp_42:    \t2013:05:30,16:48:47:343\x00\n',
 'Stamp_43:    \t2013:05:30,16:48:47:750\x00\n',
 'Stamp_44:    \t2013:05:30,16:48:48:156\x00\n',
 'Stamp_45:    \t2013:05:30,16:48:48:546\x00\n',
 'Stamp_46:    \t2013:05:30,16:48:48:953\x00\n',
 'Stamp_47:    \t2013:05:30,16:48:49:359\x00\n',
 'Stamp_48:    \t2013:05:30,16:48:49:750\x00\n',
 'Stamp_49:    \t2013:05:30,16:48:50:156\x00\n',
 'Stamp_50:    \t2013:05:30,16:48:50:562\x00\n',
 'Stamp_51:    \t2013:05:30,16:48:50:953\x00\n',
 'Stamp_52:    \t2013:05:30,16:48:51:359\x00\n',
 'Stamp_53:    \t2013:05:30,16:48:51:765\x00\n',
 'Stamp_54:    \t2013:05:30,16:48:52:156\x00\n',
 'Stamp_55:    \t2013:05:30,16:48:52:562\x00\n',
 'Stamp_56:    \t2013:05:30,16:48:52:968\x00\n',
 'Stamp_57:    \t2013:05:30,16:48:53:359\x00\n',
 'Stamp_58:    \t2013:05:30,16:48:53:765\x00\n',
 'Stamp_59:    \t2013:05:30,16:48:54:171\x00\n',
 'Stamp_60:    \t2013:05:30,16:48:54:562\x00\n',
 'Stamp_61:    \t2013:05:30,16:48:54:968\x00\n',
 'Stamp_62:    \t2013:05:30,16:48:55:375\x00\n',
 'Stamp_63:    \t2013:05:30,16:48:55:765\x00\n',
 'Stamp_64:    \t2013:05:30,16:48:56:171\x00\n',
 'Stamp_65:    \t2013:05:30,16:48:56:578\x00\n',
 'Stamp_66:    \t2013:05:30,16:48:56:968\x00\n',
 'Stamp_67:    \t2013:05:30,16:48:57:375\x00\n',
 'Stamp_68:    \t2013:05:30,16:48:57:781\x00\n',
 'Stamp_69:    \t2013:05:30,16:48:58:171\x00\n',
 'Stamp_70:    \t2013:05:30,16:48:58:578\x00\n',
 'Stamp_71:    \t2013:05:30,16:48:58:984\x00\n',
 'Stamp_72:    \t2013:05:30,16:48:59:375\x00\n',
 'Stamp_73:    \t2013:05:30,16:48:59:781\x00\n',
 'Stamp_74:    \t2013:05:30,16:49:00:187\x00\n',
 'Stamp_75:    \t2013:05:30,16:49:00:578\x00\n',
 'Stamp_76:    \t2013:05:30,16:49:00:984\x00\n',
 'Stamp_77:    \t2013:05:30,16:49:01:390\x00\n',
 'Stamp_78:    \t2013:05:30,16:49:01:781\x00\n',
 'Stamp_79:    \t2013:05:30,16:49:02:187\x00\n',
 'Stamp_80:    \t2013:05:30,16:49:02:593\x00\n',
 'Stamp_81:    \t2013:05:30,16:49:02:984\x00\n',
 'Stamp_82:    \t2013:05:30,16:49:03:390\x00\n',
 'Stamp_83:    \t2013:05:30,16:49:03:796\x00\n',
 'Stamp_84:    \t2013:05:30,16:49:04:187\x00\n',
 'Stamp_85:    \t2013:05:30,16:49:04:593\x00\n',
 'Stamp_86:    \t2013:05:30,16:49:05:00\x00\n',
 'Stamp_87:    \t2013:05:30,16:49:05:406\x00\n',
 'Stamp_88:    \t2013:05:30,16:49:05:796\x00\n',
 'Stamp_89:    \t2013:05:30,16:49:06:203\x00\n',
 'Stamp_90:    \t2013:05:30,16:49:06:593\x00\n',
 'Stamp_91:    \t2013:05:30,16:49:07:00\x00\n',
 'Stamp_92:    \t2013:05:30,16:49:07:406\x00\n',
 'Stamp_93:    \t2013:05:30,16:49:07:796\x00\n',
 'Stamp_94:    \t2013:05:30,16:49:08:203\x00\n',
 'Stamp_95:    \t2013:05:30,16:49:08:609\x00\n',
 'Stamp_96:    \t2013:05:30,16:49:09:00\x00\n',
 'Stamp_97:    \t2013:05:30,16:49:09:406\x00\n',
 'Stamp_98:    \t2013:05:30,16:49:09:812\x00\n',
 'Stamp_99:    \t2013:05:30,16:49:10:203\x00\n',
 'Stamp_100:    \t2013:05:30,16:49:10:609\x00\n',
 'Stamp_101:    \t2013:05:30,16:49:11:15\x00\n',
 'Stamp_102:    \t2013:05:30,16:49:11:406\x00\n',
 'Stamp_103:    \t2013:05:30,16:49:11:812\x00\n',
 'Stamp_104:    \t2013:05:30,16:49:12:218\x00\n',
 'Stamp_105:    \t2013:05:30,16:49:12:609\x00\n',
 'Stamp_106:    \t2013:05:30,16:49:13:15\x00\n',
 'Stamp_107:    \t2013:05:30,16:49:13:421\x00\n',
 'Stamp_108:    \t2013:05:30,16:49:13:812\x00\n',
 'Stamp_109:    \t2013:05:30,16:49:14:218\x00\n',
 'Stamp_110:    \t2013:05:30,16:49:14:625\x00\n',
 'Stamp_111:    \t2013:05:30,16:49:15:15\x00\n',
 'Stamp_112:    \t2013:05:30,16:49:15:421\x00\n',
 'Stamp_113:    \t2013:05:30,16:49:15:828\x00\n',
 'Stamp_114:    \t2013:05:30,16:49:16:218\x00\n',
 'Stamp_115:    \t2013:05:30,16:49:16:625\x00\n',
 'Stamp_116:    \t2013:05:30,16:49:17:31\x00\n',
 'Stamp_117:    \t2013:05:30,16:49:17:421\x00\n',
 'Stamp_118:    \t2013:05:30,16:49:17:828\x00\n',
 'Stamp_119:    \t2013:05:30,16:49:18:234\x00\n',
 'Stamp_120:    \t2013:05:30,16:49:18:625\x00\n',
 'Stamp_121:    \t2013:05:30,16:49:19:31\x00\n',
 'Stamp_122:    \t2013:05:30,16:49:19:437\x00\n',
 'Stamp_123:    \t2013:05:30,16:49:19:828\x00\n',
 'Stamp_124:    \t2013:05:30,16:49:20:234\x00\n',
 '\x00\n',
 'LUT DESCRIPTION #1\x00\n',
 'LUT_0\x00\n',
 'Name:                   \tRed\x00\n',
 'Inverted (1=yes / 0=no):\t0\x00\n',
 '\x00\n',
 'SEQUENTIAL INFORMATION #1\x00\n',
 'Sequence Count: \t0\x00\n',
 '\x00\n',
 'IMAGES INFORMATION #1\x00\n',
 'Number of Images: \t125\x00\n',
 'Image Width:      \t256\x00\n',
 'Iamge Length:     \t256\x00\n',
 'Bits per Sample:  \t8\x00\n',
 'Samples per Pixel:\t1\x00\n']

You'll notice the strange \x00\n at the end of each line: a null character followed by a newline. It is not at all uncommon to have to deal with such annoyances when parsing files. They are of no concern to us, since we will just strip them off. While we're at it, we'll also split each line at whitespace using the split() method of strings.

In [14]:
for i, line in enumerate(lines):
    lines[i] = line.rstrip('\x00\n').split()
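
As a small self-contained sketch of what this step does, the snippet below applies the same rstrip() and split() calls to three sample lines copied from the output above. The names raw_lines and split_lines are only for illustration.

```python
# Sample raw lines in the same form as the entries of the list above
raw_lines = [
    'Series Name:\tSeries006\x00\n',
    'Voxel-Width\t[µm]\t\t0.310171\x00\n',
    '\x00\n',
]

# rstrip() removes any trailing run of the given characters;
# split() with no arguments splits on runs of whitespace, tabs included
split_lines = [line.rstrip('\x00\n').split() for line in raw_lines]

for entry in split_lines:
    print(entry)
```

A line that held only the null character and newline becomes an empty list, which is why later cells check the length of each entry before indexing into it.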

Now, we'll search through the list of split lines until we find the

Series Name:    Series006

line. This just means that we want to find the index that has entry

['Series', 'Name:', 'Series006']  

We can use the index() method of lists to do this.

In [15]:
# Find index where Series is identified
i_start = lines.index(['Series', 'Name:', 'Series006'])

# Show us!
i_start
Out[15]:
198
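
Note that index() raises a ValueError if the entry is not in the list, which will happen if the series name is misspelled or missing from the file. A minimal sketch of a guarded lookup follows; the helper name find_series_start and the sample data are hypothetical.

```python
def find_series_start(split_lines, series_name):
    """Return the index of the 'Series Name:' entry, or None if absent."""
    target = ['Series', 'Name:', series_name]
    try:
        return split_lines.index(target)
    except ValueError:
        return None


# Made-up sample in the same form as the split lines
sample = [
    ['Series', 'Name:', 'Image003'],
    [],
    ['Series', 'Name:', 'Series006'],
]

found = find_series_start(sample, 'Series006')
missing = find_series_start(sample, 'Series999')
print(found, missing)
```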

Extracting interpixel distances

Now, we can scan forward from this starting index and extract the voxel sizes.

In [16]:
for i in range(i_start, len(lines)):
    if len(lines[i]) > 0:
        if lines[i][0] == 'Voxel-Width':
            physical_size_x = lines[i][2]
        elif lines[i][0] == 'Voxel-Height':
            physical_size_y = lines[i][2]
        
# Pixel sizes!
print('Interpixel spacing in x (µm):', physical_size_x)
print('Interpixel spacing in y (µm):', physical_size_y)
Interpixel spacing in x (µm): 0.310171
Interpixel spacing in y (µm): 0.310171
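
The loop above stores the sizes as strings and keeps scanning to the end of the file, so it gives the right answer here because Series006 is the last series described. As a hedged sketch of a variant, the hypothetical helper below stops at the first match after the starting index and converts the value to a float so it can be used in arithmetic. The sample data are abbreviated from the file contents shown earlier.

```python
def find_first_value(split_lines, key, i_start=0):
    """
    Return the last entry, as a float, of the first line at or after
    i_start whose first entry is key. Return None if no line matches.
    """
    for entry in split_lines[i_start:]:
        if len(entry) > 0 and entry[0] == key:
            return float(entry[-1])
    return None


# Abbreviated sample: one value before the series of interest, one after
sample = [
    ['Voxel-Width', '[µm]', '0.488520'],
    ['Series', 'Name:', 'Series006'],
    [],
    ['Voxel-Width', '[µm]', '0.310171'],
    ['Voxel-Height', '[µm]', '0.310171'],
]

voxel_width = find_first_value(sample, 'Voxel-Width', i_start=1)
no_match = find_first_value(sample, 'Zoom')
print(voxel_width, no_match)
```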

Extracting the time points

Now, we want to get the time points for our images. The time stamp records look like this:

Stamp_0:        2013:05:30,16:48:30:515

So, we want to find the stamps, set the time for frame 0 to be zero seconds, and then compute the time difference for all of the other frames. We learned how to do this with the datetime module.
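
As a reminder of the datetime arithmetic involved, here is a minimal sketch using the first two stamps of the series. It assumes the last field of a stamp is in milliseconds, so 515 and 921 are passed as 515000 and 921000 microseconds.

```python
import datetime

# First two stamps of the series; microsecond values assume the
# last field of each stamp is in milliseconds
t_0 = datetime.datetime(2013, 5, 30, 16, 48, 30, microsecond=515000)
t_1 = datetime.datetime(2013, 5, 30, 16, 48, 30, microsecond=921000)

# Subtracting two datetimes gives a datetime.timedelta
delta_t = t_1 - t_0

# total_seconds() expresses the difference as a float in seconds
elapsed = delta_t.total_seconds()
print(elapsed)
```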

Our strategy is to convert the time stamp string into a datetime.datetime object and then subtract the first time from the others. First, we'll write a function to convert time strings to datetime.datetime objects. The last field of each stamp is in milliseconds (successive stamps differ by about 400 in that field, and the 125 frames span 50 s), but datetime.datetime takes microseconds, so we multiply that field by 1000.

In [17]:
def time_str_to_datetime(time_str):
    """
    Convert date/time string in Leica file to
    datetime.datetime object
    """

    # Split at colons and comma
    splitter = re.compile('[,:]')

    # Split the time_str and convert to integers
    num_list = [int(num) for num in splitter.split(time_str)]

    # Last field is milliseconds; datetime expects microseconds
    num_list[-1] *= 1000

    # Return datetime.datetime object
    return datetime.datetime(*num_list)

Now that we have this function, we can scan through and find the time stamps.

In [18]:
# Scan until we get to the first time stamp
i = i_start
while len(lines[i]) == 0 or lines[i][0] != 'Stamp_0:':
    i += 1
    
# Extract the time string
t_0 = time_str_to_datetime(lines[i][1])

# Loop through successive frames and get the time points
time_points = [0]
i += 1
while len(lines[i]) > 1 and lines[i][0][:5] == 'Stamp':
    delta_t = time_str_to_datetime(lines[i][1]) - t_0
    time_points.append(delta_t.total_seconds())
    i += 1

Let's look at our list and see how we did.

In [19]:
time_points
Out[19]:
[0,
 0.406,
 0.797,
 1.203,
 1.594,
 2.0,
 2.406,
 2.797,
 3.203,
 3.61,
 4.0,
 4.406,
 4.813,
 5.203,
 5.61,
 6.016,
 6.406,
 6.813,
 7.219,
 7.61,
 8.016,
 8.422,
 8.813,
 9.219,
 9.61,
 10.016,
 10.422,
 10.813,
 11.235,
 11.625,
 12.016,
 12.422,
 12.828,
 13.219,
 13.625,
 14.031,
 14.422,
 14.828,
 15.235,
 15.625,
 16.031,
 16.438,
 16.828,
 17.235,
 17.641,
 18.031,
 18.438,
 18.844,
 19.235,
 19.641,
 20.047,
 20.438,
 20.844,
 21.25,
 21.641,
 22.047,
 22.453,
 22.844,
 23.25,
 23.656,
 24.047,
 24.453,
 24.86,
 25.25,
 25.656,
 26.063,
 26.453,
 26.86,
 27.266,
 27.656,
 28.063,
 28.469,
 28.86,
 29.266,
 29.672,
 30.063,
 30.469,
 30.875,
 31.266,
 31.672,
 32.078,
 32.469,
 32.875,
 33.281,
 33.672,
 34.078,
 34.485,
 34.891,
 35.281,
 35.688,
 36.078,
 36.485,
 36.891,
 37.281,
 37.688,
 38.094,
 38.485,
 38.891,
 39.297,
 39.688,
 40.094,
 40.5,
 40.891,
 41.297,
 41.703,
 42.094,
 42.5,
 42.906,
 43.297,
 43.703,
 44.11,
 44.5,
 44.906,
 45.313,
 45.703,
 46.11,
 46.516,
 46.906,
 47.313,
 47.719,
 48.11,
 48.516,
 48.922,
 49.313,
 49.719]

Great! We managed to extract the important metadata, which we could then use in our image analysis. We also now have nicely organized files that appear in alphabetical order.