(c) 2016 Justin Bois and Axel Müller. This work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.
This tutorial was generated from a Jupyter notebook.
Perhaps the first step toward really empowering you to command your computer to do whatever you will is to learn how to use the command line. This lesson provides a brief introduction to command line skills.
A shell is a program that takes commands from files or entered interactively through the keyboard and passes them on to the operating system to be executed. A shell is accessed through a terminal or terminal emulator.
Ken Thompson of Bell Labs developed the first shell for UNIX, called the V6 shell, in 1971. In 1977, Stephen Bourne introduced the Bourne shell (sh), which added the ability to invoke scripts (small reusable programs) from within the shell. The Bourne shell remains relevant; in some cases it is still the default root shell. Shortly afterwards, the C shell (csh) was developed, which made use of a C-like scripting language. tcsh is built on csh and is still very common. bash, the Bourne again shell, was developed by Brian Fox to replace the Bourne shell. It adds many useful features to sh and is the default shell on Macs and several Linux distributions. Future versions of Windows will also have bash, and Git-Bash uses bash as well.
The Z shell (zsh) combines useful features from a number of shells and is worth checking out. It is my (JB's) shell of choice. Since it usually requires installation, we will stick with bash, probably the most commonly used shell, for the bootcamp.
bash: pwd and ls
Now, as you did in lesson 0, Mac and Linux users, open your terminal; Windows users, open Git-Bash.
As we go through this tutorial, any text in a code cell contains commands you should enter at the command line. Let's start out by using the pwd command to figure out what directory we're in.
pwd
pwd tells you the path of your current directory. A path for a directory or file is the list of all its parent directories, separated by slashes (/), starting from the root directory, which is signified by the initial /. You are probably in your home directory.
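As a small sketch of how a path is put together, the commands below split the current path into its parent directory and its final component. The dirname and basename utilities are not covered in this lesson, and the variable names are only illustrative.

```shell
# Store the path of the current directory in a variable (name is illustrative).
current_path="$(pwd)"
echo "Current directory: $current_path"

# dirname strips the last component, leaving the path of the parent directory.
parent_path="$(dirname "$current_path")"
echo "Parent directory:  $parent_path"

# basename keeps only the last component, the directory's own name.
dir_name="$(basename "$current_path")"
echo "Directory name:    $dir_name"
```

Reading the first line of output from left to right walks you from the root directory down to where you are.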
To list all files and folders in the current directory, we employ the ls command.
ls
Let's make sure we are in the home directory:
cd
Use pwd to check where you are now. Invoking the cd command without specifying a target directory defaults to the home directory. Another way to specify your home directory is by its shortcut, ~/. In general, the tilde-slash means "home directory."
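The two ways of reaching the home directory can be compared directly. This is only a sketch; the variable names are made up for illustration, and HOME is the environment variable in which the shell stores the home directory's path.

```shell
# cd with no argument moves to the home directory.
cd
home_via_cd="$(pwd)"
echo "After plain cd: $home_via_cd"

# ~/ is shorthand for the same location.
cd ~/
home_via_tilde="$(pwd)"
echo "After cd ~/:    $home_via_tilde"

# The shell also keeps the home directory's path in the HOME variable.
echo "HOME variable:  $HOME"
```

All three lines should report the same directory.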
git and mkdir
We will be using git a lot starting with the second day of the bootcamp, and will in fact use it right now to set things up for you for this tutorial. Put briefly, git is a version control system that allows multiple programmers to work together. You are going to use it in just a moment to get all of the code you need for the bootcamp on your machine, organized exactly as I have it organized on my machine.
In my work, I like to have a directory in my home directory called git that has all of the code from git repositories that I work on. We'll have you do the same, though you are welcome to change how you organize things after the bootcamp. You will make a directory to house all of your repositories.
The mkdir command creates an empty directory, hence the name "make directory." So, let's use it to make your git directory.
mkdir git
Let's move to that directory.
cd git
Look at what is in the directory using ls.
ls
You will find there is nothing there. It is an empty directory.
While this is a very simple name for a directory, note that there are no spaces in it. In general, you should avoid spaces in directory names, even though your operating system often has them in there. Trust me on this, they can make things a total mess, especially on the command line, since a space also separates the arguments of a command.
Ok, now it is time to pull in all of the code and examples we have prepared for you. To do this, use git to clone the bootcamp repository.
git clone "https://github.com/justinbois/bootcamp/"
Now, let's look again at the contents of the directory.
ls
We now see a directory called bootcamp. This is the repository we have set up for you.
mv: renaming files
Let's venture into the bootcamp directory and see what's there.
cd bootcamp
ls
Ahoy! There is a directory called command_line_tutorial. Since we're currently doing a command line tutorial, let's go into that directory and see what is there.
cd command_line_tutorial
ls
We see that we have a directory called sequences, as well as a FASTA file named some sequence.fasta. This file name has the annoying space in it. We would like to rename it to something without a space, say some_sequence.fasta. To do this, we use the mv command, short for "move." We enter mv, followed by the name of the file we want to rename, and then its new name.
mv some sequence.fasta some_sequence.fasta
Uh-oh! That gave us some strange output, talking about the usage of mv. This is because the space in the file name some sequence.fasta was interpreted as a gap between arguments of the mv command. To specify that the space is part of the file name, we need to use an escape character, \. The space following the escape character is not treated as an argument separator. This works:
mv some\ sequence.fasta some_sequence.fasta
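The backslash is one way to protect a space; wrapping the whole name in quotes is another. The sketch below tries both in a throwaway directory so none of the bootcamp files are touched. The file names are hypothetical, and mktemp and touch (which create a temporary directory and an empty file, respectively) are not otherwise covered in this lesson.

```shell
# Work in a throwaway directory so no real files are affected.
scratch_dir="$(mktemp -d)"
cd "$scratch_dir"

# Create an empty file whose name contains a space (hypothetical name).
touch "my file.txt"

# Option 1: escape the space with a backslash.
mv my\ file.txt my_file_1.txt

# Option 2: quote the name; everything inside the quotes is one argument.
touch "my file.txt"
mv "my file.txt" my_file_2.txt

# Both renamed files should now be listed, and the spaced name should be gone.
ls
```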
Now, we probably want this file in the sequences directory. We can also move files into directories (without changing their file names) using the mv command.
mv some_sequence.fasta sequences/
ls
The trailing slash is not necessary, but I always include it out of habit to remind myself that I am moving a file to a directory.
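The trailing slash can also act as a safety check. The sketch below assumes the common GNU or BSD versions of mv and uses hypothetical names in a scratch directory: when the target directory does not exist, the version with the slash is refused, while the version without it quietly renames the file.

```shell
# Work in a throwaway directory with one empty file (names are hypothetical).
scratch_dir="$(mktemp -d)"
cd "$scratch_dir"
touch notes.txt

# no_such_dir does not exist. With the trailing slash, mv expects a directory,
# so the move is attempted here inside an if to record whether it succeeded.
if mv notes.txt no_such_dir/ 2>/dev/null; then
    trailing_slash_result="moved"
else
    trailing_slash_result="refused"
fi
echo "With trailing slash: $trailing_slash_result"

# Without the trailing slash, the same command is a plain rename:
# notes.txt becomes a file called no_such_dir.
mv notes.txt no_such_dir
ls
```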
Now let's go into the sequences directory and see what we have.
cd sequences
ls
We see that some_sequence.fasta is there, along with other FASTA files.
We would like to see what is in the sequence files. Bash offers various ways to display the content of files. We'll look at the genome of the dengue virus in the file dengue.fasta. We'll start with less. It got its name because it is more feature-rich than more, which was used to look at files. ("less is more," get it?) It allows using the up and down arrow keys to move up or down by line. It also allows scrolling by touchpad or mouse. Since it doesn't require the whole file to be read before displaying the top content, it's ideal for larger files. It also supports searching, initiated by "/" followed by the query; shift+G will go to the end of the file; gg goes to the beginning; and you can specify a line number too by ":" followed by the line number. Press q to exit less.
less dengue.fasta
We'll now look at several other ways to look at files. Just substitute them for less in the above command.
cat
cat prints the entire file to the standard output (terminal). This is especially useful if the files are very small.
head
head prints just the top lines of the file to the standard output. The default number of lines can be changed with the -n flag:
head -n 5 dengue.fasta
This will print the first 5 lines to the standard output.
tail
Like head, but for the last lines of the file.
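To see head and tail side by side without depending on the bootcamp files, here is a minimal sketch on a small generated file. The file and variable names are hypothetical, and printf is used only to write six numbered lines. Without the -n flag, both commands print 10 lines.

```shell
# Create a six-line sample file in a throwaway directory.
scratch_dir="$(mktemp -d)"
sample_file="$scratch_dir/sample.txt"
printf '%s\n' line1 line2 line3 line4 line5 line6 > "$sample_file"

# head -n 2 keeps the first two lines of the file.
first_two="$(head -n 2 "$sample_file")"
echo "First two lines:"
echo "$first_two"

# tail -n 2 keeps the last two lines of the file.
last_two="$(tail -n 2 "$sample_file")"
echo "Last two lines:"
echo "$last_two"
```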
cp
If you want to retain a copy of the folder/file in the original folder, you can use the copy command, cp. It works straightforwardly with files. Applied to directories, it requires a flag: cp -r, meaning "recursive." A flag typically begins with a hyphen (-) and gives the command some extra directions on how you want to do things. In this case, we are telling cp to work recursively.
Let's have a look at the cp command in action.
cp dengue.fasta copy_of_dengue.fasta
Maybe we want a copy of the entire sequences directory. To do that, we will cd one directory up to the command_line_tutorial directory.
cd ../
We went up one directory using ../. This is an example of a relative path. The current directory is "./", "../../" is two directories up, "../../../" is three directories up, and so on. This is very useful when navigating directory structures. Now let's try copying an entire directory with the -r flag.
cp -r sequences copy_of_sequences
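The same two steps, moving up with a relative path and copying recursively, can be rehearsed on a tiny directory tree. This is a sketch with hypothetical directory and file names; it uses mkdir -p, which creates any missing parent directories along the way, and touch to make empty files.

```shell
# Build a small directory tree in a throwaway location.
scratch_dir="$(mktemp -d)"
mkdir -p "$scratch_dir/project/data"
touch "$scratch_dir/project/data/a.txt" "$scratch_dir/project/data/b.txt"

# Start in the innermost directory, then go up one level with ../
cd "$scratch_dir/project/data"
cd ../
echo "Now in: $(pwd)"

# Copy the data directory and everything inside it.
cp -r data data_copy

# The copy contains the same files as the original.
ls data_copy
```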
We can also rename directories with the mv command. Let's rename copy_of_sequences to sequences_copy. This is silly, but illustrates how things work.
mv copy_of_sequences sequences_copy
rm
Yes, some of the things we just did are silly. We have no need to have a copy of a given sequence or a copy of the whole sequences directory. We can clean things up by deleting them. First, let's get rid of our copy of the dengue sequence. To do so, let's cd into the sequences directory and make sure it's there.
cd sequences
ls
Now let's remove the file and verify it is gone.
rm copy_of_dengue.fasta
ls
And poof! It's gone. And I mean gone. It is pretty much irrecoverable.
Warning: rm is a wrecking ball. It will destroy any files you have that do not have restrictive permissions.
Therefore, I always like to use the -i flag, which means that rm will ask me if I'm sure before deletion.
rm -i some_sequence.fasta
You will get a prompt. Answer "n" if you do not want to delete it.
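Now, let's use rm to remove an entire directory. To do this, we need to use the -r flag. Since we are still inside the sequences directory, we refer to sequences_copy, which sits one level up, by its relative path.
rm -r ../sequences_copy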
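If you would rather practice deleting somewhere harmless first, the sketch below exercises both flags in a scratch directory. All names are hypothetical. The answer to the rm -i prompt is piped in with echo only so that the example runs without typing; at the command line you would type the answer yourself.

```shell
# Work in a throwaway directory with one empty file.
scratch_dir="$(mktemp -d)"
cd "$scratch_dir"
touch draft.txt

# Answer "n" at the prompt: the file is kept.
echo "n" | rm -i draft.txt
if [ -e draft.txt ]; then kept_after_no="yes"; else kept_after_no="no"; fi
echo "Still there after answering n? $kept_after_no"

# Answer "y" at the prompt: the file is deleted.
echo "y" | rm -i draft.txt

# rm -r removes a directory together with everything in it.
mkdir old_results
touch old_results/run1.txt
rm -r old_results

# Nothing should be left in the scratch directory.
ls
```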
Yes, rm is a wrecking ball, but we can temper it using the -i flag. For safety, we would like rm to always ask us about deletion. We can instruct bash to do this for us by creating an alias.
alias rm="rm -i"
After executing this, any time we use rm, bash will instead execute rm -i, thereby keeping us out of trouble.
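An alias can be inspected and removed again, as the sketch below shows. The variable name is illustrative. Keep in mind that an alias defined at the prompt lasts only for the current session; a common approach, assumed here rather than taken from this lesson, is to put the alias line in a bash startup file such as ~/.bashrc so that it is defined in every new terminal.

```shell
# Define the alias, as in the text.
alias rm="rm -i"

# Ask the shell what rm currently stands for.
alias_definition="$(alias rm)"
echo "$alias_definition"

# At an interactive prompt, a leading backslash (\rm) skips the alias
# for a single command and runs the plain rm instead.

# unalias removes the alias again for the rest of the session.
unalias rm
```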
One of my favorite aliases is to make ls list things more prettily.
alias ls="ls -FG"
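The -F flag makes ls put a slash at the end of directories. This helps us tell the difference between files and directories. The -G flag enables coloring of the output on macOS, which is also useful for differentiating file types. (With the GNU version of ls found on Linux and in Git-Bash, -G instead hides group names in long listings, and coloring is enabled with --color=auto.)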
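To see what -F adds, here is a minimal sketch that lists a scratch directory holding one subdirectory and one file. The names are hypothetical.

```shell
# Create a throwaway directory containing one subdirectory and one empty file.
scratch_dir="$(mktemp -d)"
mkdir "$scratch_dir/results"
touch "$scratch_dir/readme.txt"

# With -F, the subdirectory is shown with a trailing slash; the file is not.
listing="$(ls -F "$scratch_dir")"
echo "$listing"
```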
You are now ready to manage files and navigate your way around the command line! My computer runs Mac OS X. I very rarely use Finder to copy, move, or even read files. I do it all on the command line. Once you get the hang of it, you will find the command line very efficient.