{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lesson 42: The Jupyter notebook" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*This lesson was generated from a Jupyter notebook. You can download the notebook [here](l42_jupyter.ipynb).*" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ " \n", " \n", " \n", " \n", "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Our numerical workhorses\n", "import numpy as np\n", "import pandas as pd\n", "\n", "# Import pyplot for plotting\n", "import matplotlib.pyplot as plt\n", "\n", "# Seaborn, useful for graphics\n", "import seaborn as sns\n", "\n", "# Import Bokeh modules for interactive plotting\n", "import bokeh.io\n", "import bokeh.mpl\n", "import bokeh.plotting\n", "\n", "# Magic function to make matplotlib inline; other style specs must come AFTER\n", "%matplotlib inline\n", "\n", "# This enables SVG graphics inline (only use with static plots (non-Bokeh))\n", "%config InlineBackend.figure_format = 'svg'\n", "\n", "# JB's favorite Seaborn settings for notebooks\n", "rc={'lines.linewidth': 2, 'axes.labelsize': 18, 'axes.titlesize': 18, \n", " 'axes.facecolor': 'DFDFE5'}\n", "sns.set_context('notebook', rc=rc)\n", "\n", "# Set up Bokeh for inline viewing\n", "bokeh.io.output_notebook()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this tutorial, you will learn the basics on how to use Jupyter notebooks. It will be useful for you to go over [Tutorial 0c from my data analysis class](http://bebi103.caltech.edu/2015/tutorials/t0c_intro_to_latex.html) to learn how to use $\\LaTeX$ in your Jupyter notebooks. \n", "You should, of course, read [the official Jupyter documentation](http://jupyter-notebook.readthedocs.org/) as well.\n", "\n", "There are many sections to this lesson, so I provide a table of contents." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "* [What is Jupyter?](#What-is-Jupyter?)\n", "* [Launching a Jupyter notebook](#Launching-a-Jupyter-notebook)\n", "* [Cells](#Cells)\n", "* [Code cells](#Code-cells)\n", "    - [Display of graphics](#Display-of-graphics)\n", "    - [Interactive plotting with Bokeh](#Interactive-plotting-with-Bokeh)\n", "    - [Proper formatting of cells](#Proper-formatting-of-cells)\n", "    - [Best practices for code cells](#Best-practices-for-code-cells)\n", "* [Markdown cells](#Markdown-cells)\n", "* [Styling your notebook](#Styling-your-notebook)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What is Jupyter?\n", "[Jupyter](http://jupyter.org) is a way to combine text (with math!) and code (which runs and can display graphic output!) in an easy-to-read document that renders in a web browser. The notebook itself is stored as a text file in [JSON](http://json.org) format. This text file is what you will email to the course staff when submitting your homework.\n", "\n", "Jupyter is language agnostic, as its name suggests. The name \"Jupyter\" is a combination of [Julia](http://julialang.org/) (a new language for scientific computing), [Python](http://python.org/) (which you know and love, or at least will when the course is over), and [R](https://www.r-project.org) (the dominant tool for statistical computation). However, you can currently run over 40 different languages in a Jupyter notebook, not just Julia, Python, and R." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Launching a Jupyter notebook\n", "Jupyter was spawned from the IPython project, so you will still see Jupyter notebooks referred to by the old name, \"IPython notebook.\" To launch a Jupyter notebook, you can do the following.\n", "* **Mac**: Use the Anaconda launcher and select Jupyter notebook.\n", "* **Windows**: Under \"Search programs and files\" from the Start menu, type `jupyter notebook` and select \"Jupyter notebook.\"\n", "\n", "A Jupyter notebook will then launch in your default web browser.\n", "\n", "You can also launch Jupyter from the command line. To do this, simply enter\n", "\n", " jupyter notebook\n", "\n", "on the command line and hit enter. This also allows for greater flexibility, as you can launch Jupyter with command line flags. For example, I launch Jupyter using\n", "\n", " jupyter notebook --browser=safari\n", "\n", "This fires up Jupyter with Safari as the browser. If you launch Jupyter from the command line, your shell will be occupied with Jupyter and will occasionally print information to the screen. After you are finished with your Jupyter session (and have saved everything), you can kill Jupyter by hitting \"`ctrl + C`\" in the terminal/PowerShell window.\n", "\n", "When you launch Jupyter, you will be presented with a menu of files in your current working directory to choose to edit. You can also bring in a file from elsewhere on your computer by clicking the \"Upload\" button in the upper right corner. You can also click \"New\" in the upper right corner to get a new Jupyter notebook. After selecting the file you wish to edit, it will appear in a new window in your browser, beautifully formatted and ready to edit." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Cells\n", "A Jupyter notebook consists of **cells**. 
The two main types of cells you will use are **code cells** and **markdown cells**, and we will go into their properties in depth momentarily. First, an overview.\n", "\n", "A code cell contains actual code that you want to run. You can specify a cell as a code cell using the pulldown menu in the toolbar in your Jupyter notebook. Alternatively, you can hit `esc` and then `y` (denoted \"`esc, y`\") while a cell is selected to specify that it is a code cell. Note that you will have to hit enter after doing this to start editing it.\n", "\n", "If you want to execute the code in a code cell, hit \"`shift + enter`.\" Note that code cells run in the order in which you execute them, not the order in which they appear in the notebook. That is to say, the ordering of the cells for which you hit \"`shift + enter`\" is the order in which the code is executed. If you did not explicitly execute a cell early in the document, its results are not known to the Python interpreter.\n", "\n", "Markdown cells contain text. The text is written in **markdown**, a lightweight markup language. You can read about its syntax [here](http://daringfireball.net/projects/markdown/syntax). Note that you can also insert HTML into markdown cells, and this will be rendered properly. As you are typing the contents of these cells, the results appear as text. Hitting \"`shift + enter`\" renders the text in the formatting you specify.\n", "\n", "You can specify a cell as being a markdown cell in the Jupyter toolbar, or by hitting \"`esc, m`\" in the cell. Again, you have to hit enter after using the quick keys to bring the cell into edit mode.\n", "\n", "In general, when you want to add a new cell, you can use the \"Insert\" pulldown menu from the Jupyter toolbar. 
The shortcut to insert a cell below is \"`esc, b`\" and to insert a cell above is \"`esc, a`.\" Alternatively, you can execute a cell and automatically add a new one below it by hitting \"`alt + enter`.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Code cells\n", "Below is an example of a code cell printing `hello, world.` Notice that the output of the print statement appears in the same cell, though separate from the code block." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "hello, world.\n" ] } ], "source": [ "# Say hello to the world.\n", "print('hello, world.')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you evaluate a Python expression that returns a value, that value is displayed as output of the code cell. This only happens, however, for the last line of the code cell." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Would show 9 if this were the last line, but it is not, so shows nothing\n", "4 + 5\n", "\n", "# I hope we see 11.\n", "5 + 6" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note, however, if the last line does not return a value, such as if we assigned a variable, there is no visible output from the code cell." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Variable assignment, so no visible output.\n", "a = 5 + 6" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# However, now if we ask for a, its value will be displayed\n", "a" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the next sections, we will demonstrate some plotting in Jupyter, so we will load in the `DataFrame` of Darwin finch data as a demo. We will use the 1987 data." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Load in DataFrame\n", "df = pd.read_csv('../data/grant/1987.csv', comment='#')\n", "\n", "# Change labels\n", "df.columns = ['band', 'species', 'beak_length', 'beak_depth']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Display of graphics\n", "When displaying graphics, you should have them **inline**, meaning that they are displayed directly in the Jupyter notebook and not in a separate window. You can specify that, as I did at the top of this document, using the `%matplotlib inline` magic function. Below is an example of graphics displayed inline.\n", "\n", "Generally, I prefer presenting graphics as scalable vector graphics (SVG). Vector graphics are infinitely zoom-able; i.e., the graphics are represented as points, lines, curves, etc., in space, not as a set of pixel values as is the case with raster graphics (such as PNG). By default, graphics are displayed as PNGs, but you can specify SVG as I have at the top of this document in the first code cell. Unfortunately, there seems to be a bug, at least when I render in Safari, where vertical and horizontal lines are not properly rendered when using SVG. 
For some reason, when I select the next cell and convert it to a code cell and back to markdown, the lines are then properly rendered. This is annoying, but I tend to think it is worth it to have nice SVG graphics. On the other hand, PNG graphics will usually suffice if you want to use them in your homework." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "