(c) 2019 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a Creative Commons Attribution License CC-BY 4.0. All code contained herein is licensed under an MIT license.
This lesson was generated from a Jupyter notebook. You can download the notebook here.
import pandas as pd
Pandas can be a bit frustrating during your first experiences with it. In this lesson, we will practice using Pandas. The more you use it, the more distant the memory of life without it will become.
!head -20 data/frog_tongue_adhesion.csv
# These data are from the paper,
#   Kleinteich and Gorb, Sci. Rep., 4, 5225, 2014.
# It was featured in the New York Times.
#    http://www.nytimes.com/2014/08/25/science/a-frog-thats-a-living-breathing-pac-man.html
#
# The authors included the data in their supplemental information.
#
# Importantly, the ID refers to the identifites of the frogs they tested.
#   I:   adult, 63 mm snout-vent-length (SVL) and 63.1 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   II:  adult, 70 mm SVL and 72.7 g body weight,
#        Ceratophrys cranwelli crossed with Ceratophrys cornuta
#   III: juvenile, 28 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
#   IV:  juvenile, 31 mm SVL and 12.7 g body weight, Ceratophrys cranwelli
date,ID,trial number,impact force (mN),impact time (ms),impact force / body weight,adhesive force (mN),time frog pulls on target (ms),adhesive force / body weight,adhesive impulse (N-s),total contact area (mm2),contact area without mucus (mm2),contact area with mucus / contact area without mucus,contact pressure (Pa),adhesive strength (Pa)
2013_02_26,I,3,1205,46,1.95,-785,884,1.27,-0.290,387,70,0.82,3117,-2030
2013_02_26,I,4,2527,44,4.08,-983,248,1.59,-0.181,101,94,0.07,24923,-9695
2013_03_01,I,1,1745,34,2.82,-850,211,1.37,-0.157,83,79,0.05,21020,-10239
2013_03_01,I,2,1556,41,2.51,-455,1025,0.74,-0.170,330,158,0.52,4718,-1381
2013_03_01,I,3,493,36,0.80,-974,499,1.57,-0.423,245,216,0.12,2012,-3975
The first lines all begin with # signs, signifying that they are comments and not data. They do give important information, though, such as the meaning of the ID data. The ID refers to which specific frog was tested.
Immediately after the comments, we have a row of comma-separated headers. This row sets the number of columns in this data set and labels the meaning of the columns. So, we see that the first column is the date of the experiment, the second column is the ID of the frog, the third is the trial number, and so on.
After this header row, each row represents a single experiment where the frog struck the target. So, these data are already in tidy format. Let's go ahead and load the data into a DataFrame.
# Load the data
df = pd.read_csv('data/frog_tongue_adhesion.csv', comment='#')

# Take a look
df.head()
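| date | ID | trial number | impact force (mN) | impact time (ms) | impact force / body weight | adhesive force (mN) | time frog pulls on target (ms) | adhesive force / body weight | adhesive impulse (N-s) | total contact area (mm2) | contact area without mucus (mm2) | contact area with mucus / contact area without mucus | contact pressure (Pa) | adhesive strength (Pa) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2013_02_26 | I | 3 | 1205 | 46 | 1.95 | -785 | 884 | 1.27 | -0.290 | 387 | 70 | 0.82 | 3117 | -2030 |
| 2013_02_26 | I | 4 | 2527 | 44 | 4.08 | -983 | 248 | 1.59 | -0.181 | 101 | 94 | 0.07 | 24923 | -9695 |
| 2013_03_01 | I | 1 | 1745 | 34 | 2.82 | -850 | 211 | 1.37 | -0.157 | 83 | 79 | 0.05 | 21020 | -10239 |
| 2013_03_01 | I | 2 | 1556 | 41 | 2.51 | -455 | 1025 | 0.74 | -0.170 | 330 | 158 | 0.52 | 4718 | -1381 |
| 2013_03_01 | I | 3 | 493 | 36 | 0.80 | -974 | 499 | 1.57 | -0.423 | 245 | 216 | 0.12 | 2012 | -3975 |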
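To see in isolation how the comment='#' keyword argument treats comment lines, here is a minimal sketch that reads a tiny, made-up CSV string instead of the frog data file. The name df_small and the values in csv_text are hypothetical and exist only for this illustration.

```python
import io

import pandas as pd

# Hypothetical miniature file: two comment lines, a header row, two data rows
csv_text = """# Made-up example data, not from the frog data set
# Lines that start with '#' are comments
ID,impact force (mN)
I,100
II,200
"""

# comment='#' makes read_csv ignore everything from '#' to the end of a line,
# so fully commented lines are skipped and the header row is found after them
df_small = pd.read_csv(io.StringIO(csv_text), comment="#")

print(df_small)
```

The resulting data frame should have only the two data rows, with column names taken from the first non-comment line.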
Your goal here is to extract certain entries out of the DataFrame.
a) Extract the impact time of all impacts that had an adhesive strength of magnitude greater than 2000 Pa. Note: The data in the 'adhesive strength (Pa)' column are all negative. This is because the adhesive force is defined to be negative in the measurement. Without changing the data in the data frame, how can you check that the magnitude (the absolute value) is greater than 2000?
b) Extract the impact force and adhesive force for all of Frog II's strikes.
c) Extract the adhesive force and the time the frog pulls on the target for juvenile frogs (Frogs III and IV). Hint: We saw the & operator for Boolean indexing across more than one column. The | operator signifies OR, and works analogously. For technical reasons that we can discuss if you like, the Python operators "and" and "or" will not work for Boolean indexing of data frames. You could also approach this using the isin() method of a Pandas Series.
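The indexing techniques mentioned in these exercises can be sketched on a small, made-up data frame before you try them on the frog data. This is one possible approach, not the only one. The name df_toy and all values in it are hypothetical; only the column names are borrowed from the data set.

```python
import pandas as pd

# Made-up values for illustration only; column names follow the frog data set
df_toy = pd.DataFrame(
    {
        "ID": ["I", "II", "III", "IV"],
        "impact time (ms)": [40, 50, 60, 70],
        "adhesive strength (Pa)": [-3000, -1500, -2500, -500],
    }
)

# Magnitude check: abs() returns a new Series, so df_toy itself is not changed
strong = df_toy.loc[
    df_toy["adhesive strength (Pa)"].abs() > 2000, "impact time (ms)"
]

# OR across conditions: use |, with each comparison wrapped in parentheses
juveniles_or = df_toy.loc[(df_toy["ID"] == "III") | (df_toy["ID"] == "IV"), :]

# Same selection with isin(), which tests membership in a collection of values
juveniles_isin = df_toy.loc[df_toy["ID"].isin(["III", "IV"]), :]

print(strong.tolist())
print(juveniles_or.equals(juveniles_isin))
```

The parentheses around each comparison matter because | binds more tightly than == in Python. The isin() version tends to scale better when there are more than two values to match.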
You'll now practice your split-apply-combine skills.
a) Compute the standard deviation of the impact forces for each frog.
b) Compute the coefficient of variation of the impact forces and adhesive forces for each frog.
c) And now, finally... Compute a DataFrame that has the mean, median, standard deviation, and coefficient of variation of the impact forces and adhesive forces for each frog. After you make this DataFrame, you might want to explore using the pd.melt() function to make it tidy. You can read the documentation and/or ask a TA to help you.
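The split-apply-combine steps above can be sketched on a small, made-up data frame. This is a sketch under stated assumptions rather than a reference solution: df_toy and its values are hypothetical, coeff_of_var is a helper written for this example, the standard deviation uses the Pandas default (ddof=1), and the coefficient of variation divides by the absolute value of the mean so that it stays positive for the negative adhesive forces.

```python
import pandas as pd

# Made-up values for illustration only; column names follow the frog data set
df_toy = pd.DataFrame(
    {
        "ID": ["I", "I", "I", "II", "II", "II"],
        "impact force (mN)": [1.0, 2.0, 3.0, 2.0, 4.0, 6.0],
        "adhesive force (mN)": [-1.0, -2.0, -3.0, -2.0, -4.0, -6.0],
    }
)


def coeff_of_var(data):
    """Hypothetical helper: sample standard deviation over the absolute mean."""
    return data.std() / abs(data.mean())


cols = ["impact force (mN)", "adhesive force (mN)"]

# Split by frog, apply one function to one column, combine into a Series
impact_std = df_toy.groupby("ID")["impact force (mN)"].std()

# Apply several summary functions at once; columns become (quantity, statistic)
summary = df_toy.groupby("ID")[cols].agg(["mean", "median", "std", coeff_of_var])

# Flatten the two-level column labels into single strings so melt() is simple
flat = summary.copy()
flat.columns = [f"{quantity}: {stat}" for quantity, stat in flat.columns]
flat = flat.reset_index()

# Long format: one row per (frog, quantity-statistic) pair
tidy = pd.melt(flat, id_vars="ID", var_name="statistic", value_name="value")

print(summary)
print(tidy)
```

Flattening the column labels before melting is a design choice made here to keep the call to pd.melt() simple. Splitting the combined label back into separate quantity and statistic columns would make the result tidier still.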
%load_ext watermark
%watermark -v -p pandas,jupyterlab

CPython 3.7.3
IPython 7.1.1

pandas 0.24.2
jupyterlab 0.35.5