{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lesson 27: Numpy arrays and operations with them\n",
"\n",
"(c) 2018 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a [Creative Commons Attribution License CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). All code contained herein is licensed under an [MIT license](https://opensource.org/licenses/MIT).\n",
"\n",
"This document was prepared at [Caltech](http://www.caltech.edu) with financial support from the [Donna and Benjamin M. Rosen Bioengineering Center](http://rosen.caltech.edu).\n",
"\n",
"\n",
"\n",
"*This tutorial was generated from a Jupyter notebook. You can download the notebook [here](l27_numpy_arrays.ipynb).*\n",
"\n",
"
"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import altair as alt\n",
"\n",
"import bootcamp_utils"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We just got an introduction to NumPy and SciPy. The packages are extensive. At the center is the NumPy array data type. We will explore this data type in this tutorial. It is worth noting that under the hood of many of the operations we do with Pandas `DataFrame`s are done with NumPy arrays. As you understand how NumPy arrays work, you will also better understand what Pandas is doing.\n",
"\n",
"As it is always more fun to work with a real biological application, we will populate our NumPy arrays with data. In their 2011 [paper in PLoS ONE](http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0025840), Harvey and Orbidans measured the cross-sectional area of *C. elegans* eggs that came from mothers who had a high concentration of food and from mothers of a low concentration of food. I digitized the data from their plots, and they are available in the file `data/c_elegans_egg_xa.csv` in the bootcamp repository."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extracting Numpy arrays from Pandas data frames\n",
"\n",
"NumPy has a primitive function for loading in data from text files, `np.loadtxt()`, but with Panda's `read_csv()`, there is really no reason to ever use it. So, we will load in the (tidy) data using Pandas."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n", " | food | \n", "area (sq. um) | \n", "
---|---|---|
0 | \n", "high | \n", "1683 | \n", "
1 | \n", "high | \n", "2061 | \n", "
2 | \n", "high | \n", "1792 | \n", "
3 | \n", "high | \n", "1852 | \n", "
4 | \n", "high | \n", "2091 | \n", "