In this lesson, you will set up a Python computing environment for scientific computing. There are two main ways people set up Python for scientific computing.
In this class, we will use Anaconda, with its associated package manager, conda. It has recently become the de facto package manager/distribution for scientific use.
We are at an interesting point in Python's history. Python is currently in version 3.5 (as of September 13, 2015). The problem is that Python 3.x is not backwards compatible with Python 2.x. Many scientific packages were written in Python 2.x and have been very slow to update to Python 3. However, Python 3 is Python's present and future, so all packages eventually need to work in Python 3. Today, most important scientific packages work in Python 3. All of the packages we will use do, so we will use Python 3 in this course.
For those of you who are already using Anaconda with Python 2, you can create a Python 3 environment.
Downloading and installing Anaconda is simple.
conda when it is available.
That's it! After you do that, you will have a functioning Python distribution.
During the bootcamp, you will need to access the command line. Doing this on a Mac or Linux is simple. If you are using Linux, it's a good bet you already know how to navigate a terminal, so we will not give specific instructions for Linux. For a Mac, you can fire up the Terminal application. It is typically in the /Applications/Utilities folder. Otherwise, hold down Command ⌘-space bar, type "terminal" in the search box, and select the Terminal application.
For Windows, download and install Git Bash. After you have installed it, simply right-click anywhere on your Desktop, and you should have an option to run Git Bash (at least that's what happens on Windows 7). You will then have a prompt that looks very much like the one Mac and Linux users will have.
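Once you have a prompt, one quick way to confirm that your installation gives you Python 3 is a short script like the one below. This is only a sketch: the file name check_python.py and the helper is_python3 are hypothetical names chosen for illustration.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True if the given version tuple belongs to Python 3 or later."""
    return version_info[0] >= 3


# Report the version of the interpreter that is running this script.
print("Running Python", sys.version.split()[0])
print("Python 3 or later:", is_python3())
```

Save it as check_python.py and run it by typing python check_python.py at the prompt. If the last line reports False, the python command is pointing at a Python 2 installation.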
conda package manager
conda is a package manager for keeping all of your packages up-to-date. It has plenty of functionality beyond our basic usage in class, which you can learn more about by reading the docs. We will primarily be using conda to install and update packages.
conda works from the command line. Now that you know how to get a command line prompt, you can start using conda. The first thing we'll do is update conda itself. To do this, enter the following on the command line:
conda update conda
If conda is out of date and needs to be updated, you will be prompted to perform the update. Just type y, and the update will proceed.
Now that conda is updated, we'll use it to see what packages are installed. Type the following on the command line:
conda list
This gives a list of all packages and their versions that are installed. Now, we'll update all packages, so type the following on the command line:
conda update --all
You will be prompted to perform all of the updates. There may even be some downgrades. This happens when there are package conflicts where one package requires an earlier version of another. conda is very smart and figures all of this out for you, so you can almost always say "yes" (or "y") to conda when it prompts you.
After you update everything, conda may tell you to install anaconda-client. You can go ahead and do this, if it prompts you to, by entering the following on the command line:
conda install anaconda-client
Finally, we will use conda to install a package we would like to use that is not included in the standard Anaconda distribution. This will also verify that conda is working properly on your machine. We will install Seaborn, which is a nice package for data visualization. To do this, type the following on the command line:
conda install seaborn
You will again be prompted to approve the installation. Go for it! Seaborn is pretty cool.
We will also need to install biopython, a nice package for bioinformatics and sequence analysis. Try:
conda install biopython
You will again be prompted to approve the installation.
Some packages are not available through conda for various reasons, perhaps because they have not been submitted to the Anaconda developers or are still under nascent development. You can still install these packages using pip, short for "PIP installs Python" or "PIP installs packages," and conda will be aware of them. One package we will use is pybeeswarm, used for making beeswarm plots. To install pybeeswarm, simply enter the following at the command line:
pip install pybeeswarm
This will install pybeeswarm, and you will be able to import it when you want to use it.
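To confirm that Python can find the packages installed above without actually importing them, something like the following sketch may help. The helper name find_missing is made up for this example, and the import names Bio (for biopython) and beeswarm (for pybeeswarm) are assumptions; adjust them if your installed versions use different names.

```python
import importlib.util


def find_missing(import_names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


# Import names are assumptions: a package's import name can differ from
# the name used to install it (e.g. biopython is assumed to import as Bio).
required = ["numpy", "matplotlib", "seaborn", "Bio", "beeswarm"]

missing = find_missing(required)
if missing:
    print("Could not find:", ", ".join(missing))
else:
    print("All packages were found.")
```

Any name reported as missing can be installed with conda install or pip install as described above.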
conda is a convenient package manager for many reasons, one being that many packages contain compiled code, and conda installs binaries, enabling you to skip the often troublesome compilation step. As you can imagine, some packages not covered by conda do need to be compiled on your machine. In order to do that, you need to have compilers installed on your machine that pip can access to do the compilation. A good way to do this for Macs is to install Developer Tools, which you can get from the App Store for free. For Windows, you can install the MinGW suite or Visual Studio. We will not use any packages that require compilation outside of conda, so you do not need to worry about this for the bootcamp.
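If you are curious whether a compiler is already visible on your machine, the sketch below looks for a few common compiler executables on your PATH. The helper name find_compilers and the list of executables (gcc, clang, cl) are assumptions for illustration, not an exhaustive list, and finding one of them does not guarantee that pip can build every package.

```python
import shutil


def find_compilers(candidates=("gcc", "clang", "cl")):
    """Map each candidate executable found on the PATH to its full path."""
    found = {}
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found


compilers = find_compilers()
if compilers:
    for name, path in compilers.items():
        print(name, "->", path)
else:
    print("No common compilers found on the PATH.")
```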
Anaconda comes with an integrated development environment (IDE) called Spyder. We will use Spyder while doing our tutorials in class for editing Python scripts and running them in an IPython console.
To start up Spyder, you can either launch it from the command line by typing
spyder
or you can do the following.
anaconda folder. Just double click the Launcher application, and you will get a window where you can choose which app to launch. Click Spyder.
Now, launch Spyder and configure the IPython settings by doing the following.
Preferences on a Mac, or
Preferences on Windows). Select
IPython console and then the
You should now be set up with a good IDE to use in the bootcamp. For the first few lessons, we will only be working in the IPython console. IPython is a package that allows for easier interaction with the Python interpreter than the standard interactive Python does. After we get rolling, we'll write code in Spyder's editor window and then run it in the console.
We'll now run a quick test to make sure things are working properly. We will make a quick plot that requires several of the scientific libraries we will use in the bootcamp.
Now, you can go to the new file which should be open in the Spyder editor window. With the exception of the obvious omission, paste the code below into the editor window. You can run the code by clicking on the green arrow on the Spyder toolbar. You may be prompted about run settings. Under "Console," choose "Execute in current Python or IPython console." You may also be prompted to save the file, which you should do, and then it will run.
# Do not enter the next line. This is only to prepare this tutorial.
# Do everything following
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate plotting values
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t)**3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)
# Generate the plot
plt.plot(x, y, 'r-')
plt.text(0, 0, 'bootcamp', fontsize=36, ha='center')
# These two commands may not be necessary, depending on your configuration.
plt.draw()
plt.show()
You should have a window pop up that shows the plot above. In Windows, you may have to click an icon on the bottom tool bar to view it. If you get this plot, excellent! You now have a functioning Python environment for scientific computing!