Lesson 2: Basic command line skills

This tutorial was generated from a Jupyter notebook. You can download the notebook here.

Perhaps the first step toward really empowering you to command your computer to do whatever you will is to learn how to use the command line. This lesson provides a brief introduction to command line skills.

What is a shell?

A shell is a program that takes commands from files or entered interactively through the keyboard and passes them on to the operating system to be executed. A shell is accessed through a terminal or terminal emulator.

A very brief historical overview

Ken Thompson of Bell Labs developed the first shell for UNIX, the Thompson shell (sometimes called the V6 shell), in 1971. In 1977, Stephen Bourne introduced the Bourne shell (sh), which added the ability to invoke scripts (small reusable programs) from within the shell. The Bourne shell remains relevant; in some cases it is still the default root shell. Shortly afterwards, the C shell (csh) was developed, which made use of a C-like scripting language. tcsh is built on csh and is still very common. bash, the Bourne again shell, was developed by Brian Fox to replace the Bourne shell. It adds many useful features to sh and has long been the default shell on Macs and several Linux distributions. The Z shell (zsh) combines useful features from a number of shells and is worth checking out. Since it usually requires installation, we will stick with bash, probably the most commonly used shell, for this lesson.

Getting started with bash

Before we get started, make sure you have downloaded this ZIP file, saved it somewhere in your home directory, and have unzipped it.

Now, as you did in lesson 0, Mac and Linux users open your terminal; Windows users, open Git Bash.

Fixed-width text in its own cell in the lesson contains commands you should enter at the command line. Let's start out by using the pwd command to figure out what directory we're in.

In [ ]:
pwd

pwd tells you the path of your current directory. A path for a directory or file is the list of all its parent directories, separated by slashes (/), up to the root directory, which is signified by the initial /. You are probably in your home directory.

To list all files and folders in the current directory, we employ the ls command.

In [ ]:
ls
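
By default, ls leaves out entries whose names start with a dot. The following is a minimal sketch of a few common ls flags, run in a throwaway directory created with mktemp so that nothing in your home directory is touched; the file and folder names are hypothetical.

```shell
# Hypothetical sandbox directory so the sketch does not touch real files.
sandbox="$(mktemp -d)"
cd "$sandbox"

# Create a visible file, a hidden file, and a folder to list.
touch notes.txt .hidden_file
mkdir data

ls        # plain listing of the current directory
ls -a     # also shows entries starting with a dot, such as .hidden_file
ls -l     # long format: permissions, owner, size, modification time

# Keep the -a listing in a variable so it can be inspected later.
all_entries="$(ls -a)"
```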

cd, change directory:

Let's make sure we are in the home directory:

In [ ]:
cd

Use pwd to check where you are now. Invoking the cd command without specifying a target directory defaults to the home directory.

mkdir

Let's make a directory (mkdir) where you will keep all of your bootcamp materials.

In [ ]:
mkdir bootcamp

Let's move to that directory and then create a new directory to house the contents of the ZIP file you downloaded.

In [ ]:
cd bootcamp
mkdir command_line_tools

Please note, there are no spaces in the directory name we just created. In general, you should avoid spaces in directory names, even though your operating system often has them in there.

Now let's go to the new command_line_tools directory.

In [ ]:
cd command_line_tools

We can take a look at what is in there.

In [ ]:
ls

And this shows that the directory is empty. Let's go back to the bootcamp directory. We can go up one directory using ../. This is an example of a relative path. The current directory is "./", "../../" is two directories up, "../../../" is three directories up, and so on.

In [ ]:
cd ../

We're now back in the bootcamp directory. Check this with pwd.
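
The navigation steps above can be sketched in one go. This example assumes a throwaway directory made with mktemp in place of your home directory, and it uses mkdir -p, which creates missing parent directories, to build both levels at once.

```shell
# Hypothetical sandbox standing in for the home directory.
sandbox="$(mktemp -d)"
cd "$sandbox"

# -p creates parent directories as needed, so both levels appear in one step.
mkdir -p bootcamp/command_line_tools

cd bootcamp/command_line_tools   # a relative path, resolved from the current directory
cd ../                           # up one level, into bootcamp
here="$(basename "$(pwd)")"
echo "$here"                     # prints: bootcamp

cd ./command_line_tools          # "./" is the current directory
cd ../../                        # up two levels, back to the sandbox
ls                               # lists: bootcamp
```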

Finding things

Earlier we downloaded a ZIP file and unpacked it. The folder name is command_line_tutorial, but where is it? To find it, we can use the find command. First, let's go back to our home directory.

In [ ]:
cd

Now, we'll use find to get the full path to the command_line_tutorial folder.

In [ ]:
find . -name "command_line_tutorial"

The first argument to find is the parent directory you want to search. In this case, "." means we want to search the current directory and all subdirectories. The -name flag says to look for something named "command_line_tutorial." find then goes through all the directories under the current directory to look for something called "command_line_tutorial" and prints its path to the terminal (this is called standard out or stdout). Now that we know where to find our folder, we can put it in our bootcamp folder. Since the path to the folder will be different for all of the students, we will use /path/to/folder/ as a proxy.
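
As a small illustration of how find reports a match, here is a sketch that builds a hypothetical directory tree in a throwaway location and searches it. The Downloads folder name is only an assumption for the example; the -type d flag, which restricts matches to directories, is an optional addition to the command used in this lesson.

```shell
# Hypothetical sandbox with the tutorial folder tucked inside a Downloads folder.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/Downloads/command_line_tutorial"
cd "$sandbox"

# -type d restricts matches to directories; -name matches the last path component.
found="$(find . -type d -name "command_line_tutorial")"
echo "$found"    # prints: ./Downloads/command_line_tutorial
```

The printed path is relative to the directory given as the first argument, which is why it starts with "./".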

Some of you may get a lot of "Permission denied" warnings. We won't get into what that means now, but if you want to suppress all of the warnings and only print the found paths to the screen, you can do the following.

In [ ]:
find . -name "command_line_tutorial" 2>/dev/null

When find prints a warning, it sends the message to a special stream, "standard error" or stderr. Bash lets us specifically deal with this stream by using 2>. /dev/null is a special file called the null device, basically a dumping ground for trash: anything written to it is discarded. For more on redirection in bash, see this page.
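
To see the two streams separately, here is a minimal sketch that uses ls on a path that does not exist (the name no_such_entry is hypothetical) to provoke an error message. The file names errors.txt and listing.txt are also just illustrative.

```shell
# Hypothetical sandbox directory.
sandbox="$(mktemp -d)"
cd "$sandbox"

# ls writes its complaint about a missing path to stderr.
# "|| true" keeps a script going despite the nonzero exit status.
ls ./no_such_entry 2> errors.txt || true    # stderr is saved to a file
ls ./no_such_entry 2> /dev/null || true     # stderr is discarded

# ">" on its own redirects stdout instead.
ls . > listing.txt

cat errors.txt     # shows the error message that was captured
cat listing.txt    # shows the normal output that was captured
```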

Moving files and directories: mv

Now that we know the location of the downloaded folder we can move it to the bootcamp folder. We'll first go to the bootcamp folder. If you are not sure what directory you are in, you can always use the ~ abbreviation for your home directory. So, to get to the bootcamp directory, do the following.

In [ ]:
cd ~/bootcamp

Now that we're in the bootcamp directory, we'll move the folder containing the materials for the command line lesson into the bootcamp directory. Remember that /path/to/folder/ is the path you found using find.

In [ ]:
mv /path/to/folder/command_line_tutorial ./

The mv command moves files and folders from one place to another. The first expression following the mv command is the complete path to the file/folder and the second expression is the path of the destination. The "./" is an abbreviation of the current directory.

Note the mv command moves stuff. The folder command_line_tutorial is now in the current directory and no longer in /path/to/folder. Check for yourself:

In [ ]:
ls 

and:

In [ ]:
ls /path/to/folder
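
The same move can be tried safely in a throwaway directory. In this sketch, the downloads folder is a hypothetical stand-in for /path/to/folder.

```shell
# Hypothetical sandbox: "downloads" plays the role of /path/to/folder.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/downloads/command_line_tutorial" "$sandbox/bootcamp"
cd "$sandbox/bootcamp"

# Move the folder into the current directory.
mv ../downloads/command_line_tutorial ./

ls                 # lists: command_line_tutorial
ls ../downloads    # prints nothing: the folder is no longer there
```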

Copying files and directories: cp

If you want to retain a copy of the folder/file in the original folder you can use the copy command cp. It works straightforwardly with files. Applied to directories it requires a flag: cp -r, meaning "recursive."

Let's have a look at the cp command in action.

First, check again that we have the command_line_tutorial folder in this directory by issuing the ls command. You should see that the bootcamp directory now contains the command_line_tutorial folder alongside command_line_tools. Then move into that folder.

In [ ]:
cd command_line_tutorial

Now, let's look at the contents in this directory.

In [ ]:
ls

This directory contains the file some structure.pdb, which has a space in its name. That's not good practice, so we had better change it. But in some cases this is not possible. For example, Google Drive creates a folder called "Google Drive". The space character has a meaning for the terminal, and its occurrence in file and directory names leads to unwanted behaviour.

Renaming the file some structure.pdb requires us to escape the space character. This won't work:

In [ ]:
mv some structure.pdb some_structure.pdb

But this will:

In [ ]:
mv some\ structure.pdb some_structure.pdb
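
The unescaped version fails because the shell splits the line at each space, so mv receives three separate arguments: some, structure.pdb, and some_structure.pdb. Besides the backslash, quoting the name is another way to keep it together. Here is a minimal sketch, run in a throwaway directory with an empty stand-in file.

```shell
# Hypothetical sandbox with an empty stand-in for the real file.
sandbox="$(mktemp -d)"
cd "$sandbox"
touch "some structure.pdb"

# Quotes make the shell treat the whole name as a single argument.
mv "some structure.pdb" some_structure.pdb

ls    # lists: some_structure.pdb
```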

Ok, with that fixed, we can now give cp a whirl. We will copy the .xml file to a file called junk, just to demonstrate how cp works.

In [ ]:
cp mpstrucAlphaHlxTbl.xml junk

Now, doing ls will show that the file junk now also exists. It is a copy of the mpstrucAlphaHlxTbl.xml file. Applying the exact same command to the directory pdbs won't work.

In [ ]:
cp pdbs pdbs_junk

However, as mentioned before, it will work with the -r flag.

In [ ]:
cp -r pdbs pdbs_junk
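
Unlike mv, cp leaves the original in place. The sketch below checks this in a throwaway directory; table.xml and example.pdb are hypothetical stand-ins for the lesson's files.

```shell
# Hypothetical sandbox with small stand-in files.
sandbox="$(mktemp -d)"
cd "$sandbox"
mkdir pdbs
echo "ATOM" > pdbs/example.pdb
echo "<xml/>" > table.xml

# Copy a single file; the original stays where it was.
cp table.xml junk

# Copy a directory and everything inside it.
cp -r pdbs pdbs_junk

ls pdbs_junk    # lists: example.pdb
ls pdbs         # lists: example.pdb (the original is untouched)
```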

Some shortcuts

Now is a good opportunity to introduce some shortcuts. First, let's look at tab completion. Type the following

In [ ]:
ls seq

and press the tab key. If there isn't anything else starting with seq, tabbing will autocomplete the term seq to sequences. If there is something else in your directory starting with "seq" you'll get a list of all these items.

Now, just for fun (and to illustrate another shortcut), let's look at what is in the sequences directory.

In [ ]:
ls sequences

Great! We like what we see. Now, let's cd into that directory. Check this out... just type this:

In [ ]:
cd !$

The !$ expression equals the last "word" of the previous line. It's a convenient trick. You could of course also just rewrite the whole line or make use of tab-completion.

This is a good opportunity to introduce a few more time savers. Instead of retyping the command, we can use the up and down arrow keys to navigate between previously used commands. For example, by pressing the up arrow until we find:

In [ ]:
cp pdbs pdbs_junk

we can edit the command accordingly (by including a -r). To edit commands, it's useful to know that ctrl+a and ctrl+e are quick ways to navigate to the beginning and end of the line. esc+b and esc+f move the cursor one word back or forward, respectively. ctrl+w deletes the word preceding the cursor. As you spend more time on the command line, these tricks will prove to be big time savers.

Removing files and directories: rm

rm deletes files and directories. It works like cp. Before we use it, let's make sure we're in the ~/bootcamp/command_line_tutorial directory.

In [ ]:
cd ~/bootcamp/command_line_tutorial

We have some junk files lying around that we need to clean up. Let's first remove the file junk.

In [ ]:
rm junk

To remove a directory, we need to use the -r flag.

In [ ]:
rm -r pdbs_junk

Warning: rm is a wrecking ball. It will destroy any files you have that do not have restrictive permissions. This is so important, I will put it in red.

`rm` is unforgiving

Therefore, I always like to use the -i flag, which means that rm will ask me if I'm sure before deletion.

In [ ]:
rm -r -i some_directory

Sometimes rm will prompt you before deleting files in a directory with certain permissions. Some choose to avoid this by specifying the -f, or force, flag. This should be used with a great deal of caution because ...

`rm` is unforgiving

In [ ]:
# BE CAREFUL
rm -r -f some_directory
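
One cautious habit, shown here as a sketch rather than a rule, is to list what you are about to delete before deleting it. The example runs in a throwaway directory with hypothetical file names, so it is safe to try.

```shell
# Hypothetical sandbox so that nothing real can be deleted.
sandbox="$(mktemp -d)"
cd "$sandbox"
mkdir pdbs_junk
touch junk pdbs_junk/a.pdb keep_me.txt

# Look before deleting: -R lists the directory contents recursively.
ls -R pdbs_junk

rm junk            # remove a single file
rm -r pdbs_junk    # remove a directory and its contents

ls                 # lists: keep_me.txt
```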

Wild card characters

Wild card characters can be quite useful. We use the * character to specify that anything can go where the * is. For example, let's say we wanted to list all fasta files in the ~/bootcamp/command_line_tutorial/sequences directory (and not the README file). We would do the following.

In [ ]:
cd ~/bootcamp/command_line_tutorial/sequences
ls *.fasta
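
It is the shell, not ls, that expands the wild card into matching file names before the command runs, which is why echo shows the same matches. Here is a minimal sketch with hypothetical file names; it also uses the ? wild card, which matches exactly one character.

```shell
# Hypothetical sandbox with empty stand-in files.
sandbox="$(mktemp -d)"
cd "$sandbox"
touch README seq1.fasta seq2.fasta seq10.fasta

ls *.fasta       # lists the three .fasta files, but not README
echo *.fasta     # the shell expands the pattern before echo ever sees it
ls seq?.fasta    # "?" matches exactly one character: seq1.fasta and seq2.fasta

# Count the matches for each pattern.
n_star="$(ls *.fasta | wc -l | tr -d ' ')"
n_question="$(ls seq?.fasta | wc -l | tr -d ' ')"
echo "$n_star $n_question"    # prints: 3 2
```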

Relative paths

We already talked about relative paths, but let's play with them a bit. cd into the ~/bootcamp/command_line_tutorial/pdbs directory. Then,

In [ ]:
ls ..

will show you the contents one directory up and

In [ ]:
ls ../.. 

will show you the contents two directories up.

To check out the sequences directory, we go up one directory and then enter the sequences directory from there:

In [ ]:
ls ../sequences 

Exploring file content

Bash offers various ways to display the content of files. Let's try them with some of the pdb files in the pdbs folder. First, let's list the files (make sure you're in the ~/bootcamp/command_line_tutorial/pdbs directory).

In [ ]:
ls

We'll look at the aquaporin in the file 1z98.pdb. There are lots of ways to do it. We'll start with more.

In [ ]:
more 1z98.pdb

This way of looking at a file allows paging through text one screenful at a time. Use the space bar to display more. It's not possible to go back on some systems (it is possible to go back in Mac OS X). Note that more might be unavailable if you are using Git Bash on Windows.

We'll now look at several other ways to look at files. Just substitute them for more in the above command.

less

Like more, but allows using the arrow up and arrow down keys to traverse up or down by line. It also allows scrolling by touchpad or mouse. Since it doesn't require the whole file to be read before displaying the top content, it's ideal for larger files. It also supports searching, initiated by "/" followed by the query; shift+G will go to the end of the file; gg to the beginning; and you can jump to a specific line by typing the line number followed by g.

cat

cat prints the entire file to the standard output (terminal). This is especially useful if the files are very small.

head

head prints just the top lines of the file (ten by default) to the standard output. The default can be changed:

head -5 1z98.pdb

This will print the first 5 lines to the standard output.

tail

Like head, but for the last lines of the file.
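
To see head and tail side by side, here is a small sketch on a generated twelve-line file (demo.txt is a hypothetical name). It uses the -n form of the line-count option, which does the same job as the shorter head -5 style.

```shell
# Hypothetical sandbox with a twelve-line demo file.
sandbox="$(mktemp -d)"
cd "$sandbox"
printf 'line %s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > demo.txt

head -n 3 demo.txt    # prints line 1 through line 3
tail -n 2 demo.txt    # prints line 11 and line 12

# Capture the very first and very last lines.
first="$(head -n 1 demo.txt)"
last="$(tail -n 1 demo.txt)"
echo "$first / $last"    # prints: line 1 / line 12
```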

You are now equipped to manage files and navigate your way around the command line! My computer runs Mac OS X. I very rarely use Finder to copy, move, or even read files. I do it all on the command line. Once you get the hang of it, you will find the command line very efficient.