Exercise 5.3: Adding data to a data frame


[1]:
import pandas as pd

We continue working with the frog tongue data. Recall that the header comments in the data file contained information about the frogs.

[2]:
!head -20 data/frog_tongue_adhesion.csv
# These data are from the paper,
#   Kleinteich and Gorb, Sci. Rep., 4, 5225, 2014.
# It was featured in the New York Times.
#    http://www.nytimes.com/2014/08/25/science/a-frog-thats-a-living-breathing-pac-man.html
#
# The authors included the data in their supplemental information.
#
# Importantly, the ID refers to the identifites of the frogs they tested.
#   I:   adult, 63 mm snout-vent-length (SVL) and 63.1 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   II:  adult, 70 mm SVL and 72.7 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   III: juvenile, 28 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
#   IV:  juvenile, 31 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
date,ID,trial number,impact force (mN),impact time (ms),impact force / body weight,adhesive force (mN),time frog pulls on target (ms),adhesive force / body weight,adhesive impulse (N-s),total contact area (mm2),contact area without mucus (mm2),contact area with mucus / contact area without mucus,contact pressure (Pa),adhesive strength (Pa)
2013_02_26,I,3,1205,46,1.95,-785,884,1.27,-0.290,387,70,0.82,3117,-2030
2013_02_26,I,4,2527,44,4.08,-983,248,1.59,-0.181,101,94,0.07,24923,-9695
2013_03_01,I,1,1745,34,2.82,-850,211,1.37,-0.157,83,79,0.05,21020,-10239
2013_03_01,I,2,1556,41,2.51,-455,1025,0.74,-0.170,330,158,0.52,4718,-1381
2013_03_01,I,3,493,36,0.80,-974,499,1.57,-0.423,245,216,0.12,2012,-3975

So, each frog has associated with it an age (adult or juvenile), snout-vent-length (SVL), body weight, and species (either cross or cranwelli). For a tidy data frame, we should have a column for each of these values. Your task is to load in the data, and then add these columns to the data frame. For convenience, here is a data frame with data about each frog.

[3]:
df_frog = pd.DataFrame(
    data={
        "ID": ["I", "II", "III", "IV"],
        "age": ["adult", "adult", "juvenile", "juvenile"],
        "SVL (mm)": [63, 70, 28, 31],
        "weight (g)": [63.1, 72.7, 12.7, 12.7],
        "species": ["cross", "cross", "cranwelli", "cranwelli"],
    }
)

Note: There are lots of ways to solve this problem. This is a good exercise in searching through the Pandas documentation and other online resources, such as Stack Overflow. Remember, much of your programming efforts are spent searching through documentation and the internet.

Solution

The most direct way is to use built-in pd.merge() function. This function finds a common column between two DataFrames, and then uses that column to merge them, filling in values that match in the common column. This is exactly what we want.

[4]:
# Load the data
df = pd.read_csv('data/frog_tongue_adhesion.csv', comment='#')

# Perform merge
df = df.merge(df_frog)

Let’s look at the DataFrame to make sure it has what we expect.

[5]:
df.head()
[5]:
date ID trial number impact force (mN) impact time (ms) impact force / body weight adhesive force (mN) time frog pulls on target (ms) adhesive force / body weight adhesive impulse (N-s) total contact area (mm2) contact area without mucus (mm2) contact area with mucus / contact area without mucus contact pressure (Pa) adhesive strength (Pa) age SVL (mm) weight (g) species
0 2013_02_26 I 3 1205 46 1.95 -785 884 1.27 -0.290 387 70 0.82 3117 -2030 adult 63 63.1 cross
1 2013_02_26 I 4 2527 44 4.08 -983 248 1.59 -0.181 101 94 0.07 24923 -9695 adult 63 63.1 cross
2 2013_03_01 I 1 1745 34 2.82 -850 211 1.37 -0.157 83 79 0.05 21020 -10239 adult 63 63.1 cross
3 2013_03_01 I 2 1556 41 2.51 -455 1025 0.74 -0.170 330 158 0.52 4718 -1381 adult 63 63.1 cross
4 2013_03_01 I 3 493 36 0.80 -974 499 1.57 -0.423 245 216 0.12 2012 -3975 adult 63 63.1 cross

Computing environment

[6]:
%load_ext watermark
%watermark -v -p pandas,jupyterlab
CPython 3.7.7
IPython 7.13.0

pandas 0.24.2
jupyterlab 1.2.6