Exercise 3.3: Adding data to a data frame
[1]:
import polars as pl
We continue working with the frog tongue data. Recall that the header comments in the data file contained information about the frogs.
[2]:
!head -20 data/frog_tongue_adhesion.csv
# These data are from the paper,
# Kleinteich and Gorb, Sci. Rep., 4, 5225, 2014.
# It was featured in the New York Times.
# http://www.nytimes.com/2014/08/25/science/a-frog-thats-a-living-breathing-pac-man.html
#
# The authors included the data in their supplemental information.
#
# Importantly, the ID refers to the identifites of the frogs they tested.
# I: adult, 63 mm snout-vent-length (SVL) and 63.1 g body weight,
# Ceratophrys cranwelli crossed with Ceratophrys cornuta
# II: adult, 70 mm SVL and 72.7 g body weight,
# Ceratophrys cranwelli crossed with Ceratophrys cornuta
# III: juvenile, 28 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
# IV: juvenile, 31 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
date,ID,trial number,impact force (mN),impact time (ms),impact force / body weight,adhesive force (mN),time frog pulls on target (ms),adhesive force / body weight,adhesive impulse (N-s),total contact area (mm2),contact area without mucus (mm2),contact area with mucus / contact area without mucus,contact pressure (Pa),adhesive strength (Pa)
2013_02_26,I,3,1205,46,1.95,-785,884,1.27,-0.290,387,70,0.82,3117,-2030
2013_02_26,I,4,2527,44,4.08,-983,248,1.59,-0.181,101,94,0.07,24923,-9695
2013_03_01,I,1,1745,34,2.82,-850,211,1.37,-0.157,83,79,0.05,21020,-10239
2013_03_01,I,2,1556,41,2.51,-455,1025,0.74,-0.170,330,158,0.52,4718,-1381
2013_03_01,I,3,493,36,0.80,-974,499,1.57,-0.423,245,216,0.12,2012,-3975
So, each frog has associated with it an age (adult or juvenile), snout-vent-length (SVL), body weight, and species (either cross or cranwelli). For a tidy data frame, we should have a column for each of these values. Your task is to load in the data, and then add these columns to the data frame. For convenience, here is a data frame with data about each frog.
[3]:
df_frog = pl.DataFrame(
    data={
        "ID": ["I", "II", "III", "IV"],
        "age": ["adult", "adult", "juvenile", "juvenile"],
        "SVL (mm)": [63, 70, 28, 31],
        "weight (g)": [63.1, 72.7, 12.7, 12.7],
        "species": ["cross", "cross", "cranwelli", "cranwelli"],
    }
)
Note: There are lots of ways to solve this problem. This is a good exercise in searching through the Polars documentation and other online resources. Until you have real mastery of a package, I encourage you to read the documentation instead of asking a chatbot to do it.
Solution
The most direct way is to do a join. The join() method matches rows of two DataFrames on a shared column (here, ID) and, for each row of the first DataFrame, fills in the columns of the second DataFrame from the row with the matching value. This is exactly what we want.
[4]:
# Load the data
df = pl.read_csv('data/frog_tongue_adhesion.csv', comment_prefix='#')
# Join the per-frog data onto the measurements, matching on the ID column
df = df.join(df_frog, on='ID')
Let’s look at the DataFrame to make sure it has what we expect.
[5]:
df.head()
[5]:
date | ID | trial number | impact force (mN) | impact time (ms) | impact force / body weight | adhesive force (mN) | time frog pulls on target (ms) | adhesive force / body weight | adhesive impulse (N-s) | total contact area (mm2) | contact area without mucus (mm2) | contact area with mucus / contact area without mucus | contact pressure (Pa) | adhesive strength (Pa) | age | SVL (mm) | weight (g) | species |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
str | str | i64 | i64 | i64 | f64 | i64 | i64 | f64 | f64 | i64 | i64 | f64 | i64 | i64 | str | i64 | f64 | str |
"2013_02_26" | "I" | 3 | 1205 | 46 | 1.95 | -785 | 884 | 1.27 | -0.29 | 387 | 70 | 0.82 | 3117 | -2030 | "adult" | 63 | 63.1 | "cross" |
"2013_02_26" | "I" | 4 | 2527 | 44 | 4.08 | -983 | 248 | 1.59 | -0.181 | 101 | 94 | 0.07 | 24923 | -9695 | "adult" | 63 | 63.1 | "cross" |
"2013_03_01" | "I" | 1 | 1745 | 34 | 2.82 | -850 | 211 | 1.37 | -0.157 | 83 | 79 | 0.05 | 21020 | -10239 | "adult" | 63 | 63.1 | "cross" |
"2013_03_01" | "I" | 2 | 1556 | 41 | 2.51 | -455 | 1025 | 0.74 | -0.17 | 330 | 158 | 0.52 | 4718 | -1381 | "adult" | 63 | 63.1 | "cross" |
"2013_03_01" | "I" | 3 | 493 | 36 | 0.8 | -974 | 499 | 1.57 | -0.423 | 245 | 216 | 0.12 | 2012 | -3975 | "adult" | 63 | 63.1 | "cross" |
Computing environment
[6]:
%load_ext watermark
%watermark -v -p polars,jupyterlab
Python implementation: CPython
Python version : 3.13.5
IPython version : 9.4.0
polars : 1.31.0
jupyterlab: 4.4.5