Exercise 5.3: Adding data to a data frame¶

[1]:

import pandas as pd

We continue working with the frog tongue data. Recall that the header comments in the data file contained information about the frogs.

[2]:

!head -20 data/frog_tongue_adhesion.csv

# These data are from the paper,
#   Kleinteich and Gorb, Sci. Rep., 4, 5225, 2014.
# It was featured in the New York Times.
#    http://www.nytimes.com/2014/08/25/science/a-frog-thats-a-living-breathing-pac-man.html
#
# The authors included the data in their supplemental information.
#
# Importantly, the ID refers to the identifites of the frogs they tested.
#   I:   adult, 63 mm snout-vent-length (SVL) and 63.1 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   II:  adult, 70 mm SVL and 72.7 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   III: juvenile, 28 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
#   IV:  juvenile, 31 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
date,ID,trial number,impact force (mN),impact time (ms),impact force / body weight,adhesive force (mN),time frog pulls on target (ms),adhesive force / body weight,adhesive impulse (N-s),total contact area (mm2),contact area without mucus (mm2),contact area with mucus / contact area without mucus,contact pressure (Pa),adhesive strength (Pa)
2013_02_26,I,3,1205,46,1.95,-785,884,1.27,-0.290,387,70,0.82,3117,-2030
2013_02_26,I,4,2527,44,4.08,-983,248,1.59,-0.181,101,94,0.07,24923,-9695
2013_03_01,I,1,1745,34,2.82,-850,211,1.37,-0.157,83,79,0.05,21020,-10239
2013_03_01,I,2,1556,41,2.51,-455,1025,0.74,-0.170,330,158,0.52,4718,-1381
2013_03_01,I,3,493,36,0.80,-974,499,1.57,-0.423,245,216,0.12,2012,-3975

So, each frog has associated with it an age (adult or juvenile), snout-vent-length (SVL), body weight, and species (either cross or cranwelli). For a tidy data frame, we should have a column for each of these values. Your task is to load in the data, and then add these columns to the data frame. For convenience, here is a data frame with data about each frog.

[3]:

df_frog = pd.DataFrame(
    data={
        "ID": ["I", "II", "III", "IV"],
        "age": ["adult", "adult", "juvenile", "juvenile"],
        "SVL (mm)": [63, 70, 28, 31],
        "weight (g)": [63.1, 72.7, 12.7, 12.7],
        "species": ["cross", "cross", "cranwelli", "cranwelli"],
    }
)

Note: There are lots of ways to solve this problem. This is a good exercise in searching through the Pandas documentation and other online resources, such as Stack Overflow. Remember, much of your programming efforts are spent searching through documentation and the internet.

Solution¶

The most direct way is to use built-in pd.merge() function. This function finds a common column between two DataFrames, and then uses that column to merge them, filling in values that match in the common column. This is exactly what we want.

[4]:

# Load the data
df = pd.read_csv('data/frog_tongue_adhesion.csv', comment='#')

# Perform merge
df = df.merge(df_frog)

Let’s look at the DataFrame to make sure it has what we expect.

[5]:

df.head()

[5]:

	date	ID	trial number	impact force (mN)	impact time (ms)	impact force / body weight	adhesive force (mN)	time frog pulls on target (ms)	adhesive force / body weight	adhesive impulse (N-s)	total contact area (mm2)	contact area without mucus (mm2)	contact area with mucus / contact area without mucus	contact pressure (Pa)	adhesive strength (Pa)	age	SVL (mm)	weight (g)	species
0	2013_02_26	I	3	1205	46	1.95	-785	884	1.27	-0.290	387	70	0.82	3117	-2030	adult	63	63.1	cross
1	2013_02_26	I	4	2527	44	4.08	-983	248	1.59	-0.181	101	94	0.07	24923	-9695	adult	63	63.1	cross
2	2013_03_01	I	1	1745	34	2.82	-850	211	1.37	-0.157	83	79	0.05	21020	-10239	adult	63	63.1	cross
3	2013_03_01	I	2	1556	41	2.51	-455	1025	0.74	-0.170	330	158	0.52	4718	-1381	adult	63	63.1	cross
4	2013_03_01	I	3	493	36	0.80	-974	499	1.57	-0.423	245	216	0.12	2012	-3975	adult	63	63.1	cross

Computing environment¶

[6]:

%load_ext watermark
%watermark -v -p pandas,jupyterlab

CPython 3.7.7
IPython 7.13.0

pandas 0.24.2
jupyterlab 1.2.6