{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Exercise 5.3: Adding data to a data frame\n", "\n", "<hr>" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<hr>\n", "\n", "We continue working with the frog tongue data. Recall that the header comments in the data file contained information about the frogs." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "# These data are from the paper,\n", "# Kleinteich and Gorb, Sci. Rep., 4, 5225, 2014.\n", "# It was featured in the New York Times.\n", "# http://www.nytimes.com/2014/08/25/science/a-frog-thats-a-living-breathing-pac-man.html\n", "#\n", "# The authors included the data in their supplemental information.\n", "#\n", "# Importantly, the ID refers to the identifites of the frogs they tested.\n", "# I: adult, 63 mm snout-vent-length (SVL) and 63.1 g body weight,\n", "# Ceratophrys cranwelli crossed with Ceratophrys cornuta\n", "# II: adult, 70 mm SVL and 72.7 g body weight,\n", "# Ceratophrys cranwelli crossed with Ceratophrys cornuta\n", "# III: juvenile, 28 mm SVL and 12.7 g body weight, Ceratophrys cranwelli\n", "# IV: juvenile, 31 mm SVL and 12.7 g body weight, Ceratophrys cranwelli\n", "date,ID,trial number,impact force (mN),impact time (ms),impact force / body weight,adhesive force (mN),time frog pulls on target (ms),adhesive force / body weight,adhesive impulse (N-s),total contact area (mm2),contact area without mucus (mm2),contact area with mucus / contact area without mucus,contact pressure (Pa),adhesive strength (Pa)\n", "2013_02_26,I,3,1205,46,1.95,-785,884,1.27,-0.290,387,70,0.82,3117,-2030\n", "2013_02_26,I,4,2527,44,4.08,-983,248,1.59,-0.181,101,94,0.07,24923,-9695\n", "2013_03_01,I,1,1745,34,2.82,-850,211,1.37,-0.157,83,79,0.05,21020,-10239\n", "2013_03_01,I,2,1556,41,2.51,-455,1025,0.74,-0.170,330,158,0.52,4718,-1381\n", "2013_03_01,I,3,493,36,0.80,-974,499,1.57,-0.423,245,216,0.12,2012,-3975\n" ] } ], "source": [ "!head -20 data/frog_tongue_adhesion.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, each frog has associated with it an age (adult or juvenile), snout-vent-length (SVL), body weight, and species (either cross or *cranwelli*). For a tidy data frame, we should have a column for each of these values. Your task is to load in the data, and then add these columns to the data frame. For convenience, here is a data frame with data about each frog." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "df_frog = pd.DataFrame(\n", " data={\n", " \"ID\": [\"I\", \"II\", \"III\", \"IV\"],\n", " \"age\": [\"adult\", \"adult\", \"juvenile\", \"juvenile\"],\n", " \"SVL (mm)\": [63, 70, 28, 31],\n", " \"weight (g)\": [63.1, 72.7, 12.7, 12.7],\n", " \"species\": [\"cross\", \"cross\", \"cranwelli\", \"cranwelli\"],\n", " }\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: There are lots of ways to solve this problem. This is a good exercise in searching through the [Pandas documentation](https://pandas.pydata.org/pandas-docs/stable/) and other online resources, such as [Stack Overflow](https://stackoverflow.com/questions). Remember, much of your programming efforts are spent searching through documentation and the internet." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<br />" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Solution\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The most direct way is to use [built-in pd.merge() function](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html). This function finds a common column between two `DataFrames`, and then uses that column to merge them, filling in values that match in the common column. This is exactly what we want." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Load the data\n", "df = pd.read_csv('data/frog_tongue_adhesion.csv', comment='#')\n", "\n", "# Perform merge\n", "df = df.merge(df_frog)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at the `DataFrame` to make sure it has what we expect." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "<div>\n", "<style scoped>\n", " .dataframe tbody tr th:only-of-type {\n", " vertical-align: middle;\n", " }\n", "\n", " .dataframe tbody tr th {\n", " vertical-align: top;\n", " }\n", "\n", " .dataframe thead th {\n", " text-align: right;\n", " }\n", "</style>\n", "<table border=\"1\" class=\"dataframe\">\n", " <thead>\n", " <tr style=\"text-align: right;\">\n", " <th></th>\n", " <th>date</th>\n", " <th>ID</th>\n", " <th>trial number</th>\n", " <th>impact force (mN)</th>\n", " <th>impact time (ms)</th>\n", " <th>impact force / body weight</th>\n", " <th>adhesive force (mN)</th>\n", " <th>time frog pulls on target (ms)</th>\n", " <th>adhesive force / body weight</th>\n", " <th>adhesive impulse (N-s)</th>\n", " <th>total contact area (mm2)</th>\n", " <th>contact area without mucus (mm2)</th>\n", " <th>contact area with mucus / contact area without mucus</th>\n", " <th>contact pressure (Pa)</th>\n", " <th>adhesive strength (Pa)</th>\n", " <th>age</th>\n", " <th>SVL (mm)</th>\n", " <th>weight (g)</th>\n", " <th>species</th>\n", " </tr>\n", " </thead>\n", " <tbody>\n", " <tr>\n", " <th>0</th>\n", " <td>2013_02_26</td>\n", " <td>I</td>\n", " <td>3</td>\n", " <td>1205</td>\n", " <td>46</td>\n", " <td>1.95</td>\n", " <td>-785</td>\n", " <td>884</td>\n", " <td>1.27</td>\n", " <td>-0.290</td>\n", " <td>387</td>\n", " <td>70</td>\n", " <td>0.82</td>\n", " <td>3117</td>\n", " <td>-2030</td>\n", " <td>adult</td>\n", " <td>63</td>\n", " <td>63.1</td>\n", " <td>cross</td>\n", " </tr>\n", " <tr>\n", " <th>1</th>\n", " <td>2013_02_26</td>\n", " <td>I</td>\n", " <td>4</td>\n", " <td>2527</td>\n", " <td>44</td>\n", " <td>4.08</td>\n", " <td>-983</td>\n", " <td>248</td>\n", " <td>1.59</td>\n", " <td>-0.181</td>\n", " <td>101</td>\n", " <td>94</td>\n", " <td>0.07</td>\n", " <td>24923</td>\n", " <td>-9695</td>\n", " <td>adult</td>\n", " <td>63</td>\n", " <td>63.1</td>\n", " <td>cross</td>\n", " </tr>\n", " <tr>\n", " <th>2</th>\n", " <td>2013_03_01</td>\n", " <td>I</td>\n", " <td>1</td>\n", " <td>1745</td>\n", " <td>34</td>\n", " <td>2.82</td>\n", " <td>-850</td>\n", " <td>211</td>\n", " <td>1.37</td>\n", " <td>-0.157</td>\n", " <td>83</td>\n", " <td>79</td>\n", " <td>0.05</td>\n", " <td>21020</td>\n", " <td>-10239</td>\n", " <td>adult</td>\n", " <td>63</td>\n", " <td>63.1</td>\n", " <td>cross</td>\n", " </tr>\n", " <tr>\n", " <th>3</th>\n", " <td>2013_03_01</td>\n", " <td>I</td>\n", " <td>2</td>\n", " <td>1556</td>\n", " <td>41</td>\n", " <td>2.51</td>\n", " <td>-455</td>\n", " <td>1025</td>\n", " <td>0.74</td>\n", " <td>-0.170</td>\n", " <td>330</td>\n", " <td>158</td>\n", " <td>0.52</td>\n", " <td>4718</td>\n", " <td>-1381</td>\n", " <td>adult</td>\n", " <td>63</td>\n", " <td>63.1</td>\n", " <td>cross</td>\n", " </tr>\n", " <tr>\n", " <th>4</th>\n", " <td>2013_03_01</td>\n", " <td>I</td>\n", " <td>3</td>\n", " <td>493</td>\n", " <td>36</td>\n", " <td>0.80</td>\n", " <td>-974</td>\n", " <td>499</td>\n", " <td>1.57</td>\n", " <td>-0.423</td>\n", " <td>245</td>\n", " <td>216</td>\n", " <td>0.12</td>\n", " <td>2012</td>\n", " <td>-3975</td>\n", " <td>adult</td>\n", " <td>63</td>\n", " <td>63.1</td>\n", " <td>cross</td>\n", " </tr>\n", " </tbody>\n", "</table>\n", "</div>" ], "text/plain": [ " date ID trial number impact force (mN) impact time (ms) \\\n", "0 2013_02_26 I 3 1205 46 \n", "1 2013_02_26 I 4 2527 44 \n", "2 2013_03_01 I 1 1745 34 \n", "3 2013_03_01 I 2 1556 41 \n", "4 2013_03_01 I 3 493 36 \n", "\n", " impact force / body weight adhesive force (mN) \\\n", "0 1.95 -785 \n", "1 4.08 -983 \n", "2 2.82 -850 \n", "3 2.51 -455 \n", "4 0.80 -974 \n", "\n", " time frog pulls on target (ms) adhesive force / body weight \\\n", "0 884 1.27 \n", "1 248 1.59 \n", "2 211 1.37 \n", "3 1025 0.74 \n", "4 499 1.57 \n", "\n", " adhesive impulse (N-s) total contact area (mm2) \\\n", "0 -0.290 387 \n", "1 -0.181 101 \n", "2 -0.157 83 \n", "3 -0.170 330 \n", "4 -0.423 245 \n", "\n", " contact area without mucus (mm2) \\\n", "0 70 \n", "1 94 \n", "2 79 \n", "3 158 \n", "4 216 \n", "\n", " contact area with mucus / contact area without mucus \\\n", "0 0.82 \n", "1 0.07 \n", "2 0.05 \n", "3 0.52 \n", "4 0.12 \n", "\n", " contact pressure (Pa) adhesive strength (Pa) age SVL (mm) weight (g) \\\n", "0 3117 -2030 adult 63 63.1 \n", "1 24923 -9695 adult 63 63.1 \n", "2 21020 -10239 adult 63 63.1 \n", "3 4718 -1381 adult 63 63.1 \n", "4 2012 -3975 adult 63 63.1 \n", "\n", " species \n", "0 cross \n", "1 cross \n", "2 cross \n", "3 cross \n", "4 cross " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Computing environment" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPython 3.7.7\n", "IPython 7.13.0\n", "\n", "pandas 0.24.2\n", "jupyterlab 1.2.6\n" ] } ], "source": [ "%load_ext watermark\n", "%watermark -v -p pandas,jupyterlab" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.7" } }, "nbformat": 4, "nbformat_minor": 4 }