Exercise 8.2: Hacker stats with bee sperm data


Neonicotinoid pesticides are thought to have inadvertent effects on service-providing insects such as bees. A study of this was featured in the New York Times in 2016. The original paper is Straub, et al., Proc. Royal Soc. B 283(1835): 20160506. Straub and coworkers put their data in the Dryad repository, which means we can work with it!

(Do you see a trend here? If you want people to think deeply about your results, explore them, learn from them, further science with them, make your data publicly available. Strongly encourage the members of your lab to do the same.)

We will look at the weight of drones (male bees) using the data set stored in ~/git/bootcamp/data/bee_weight.csv and the sperm quality of drone bees using the data set stored in ~/git/bootcamp/data/bee_sperm.csv.

a) Load the drone weight data in as a Pandas DataFrame. Note that the unit of the weight is milligrams (mg).

b) Plot ECDFs of the drone weight for control and also for those exposed to pesticide. Do you think there is a clear difference?

c) Compute the mean drone weight for control and those exposed to pesticide. Compute 95% bootstrap confidence intervals on the mean.

d) Repeat parts (a)-(c) for drone sperm. Use the 'Quality' column as your measure. This is defined as the percent of sperm that are alive in a 500 µL sample.

e) As you have seen in your analysis in part (d), both the control and pesticide treatments have some outliers with very low sperm quality. This can tug heavily on the mean. So, get 95% bootstrap confidence intervals for the median sperm quality of the two treatments.