{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lesson 30: Hacker statistics\n", "\n", "(c) 2018 Justin Bois. With the exception of pasted graphics, where the source is noted, this work is licensed under a [Creative Commons Attribution License CC-BY 4.0](https://creativecommons.org/licenses/by/4.0/). All code contained herein is licensed under an [MIT license](https://opensource.org/licenses/MIT).\n", "\n", "This document was prepared at [Caltech](http://www.caltech.edu) with financial support from the [Donna and Benjamin M. Rosen Bioengineering Center](http://rosen.caltech.edu).\n", "\n", "\n", "\n", "*This tutorial was generated from a Jupyter notebook. You can download the notebook [here](l30_hackerstats.ipynb).*\n", "\n", "

" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "import altair as alt\n", "\n", "import bootcamp_utils" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "When the field of statistics was in its early days, the practitioners did not have computers. They were therefore left to use pen and paper to compute things like confidence intervals. With just a little bit of programming experience, you can perform lots of the statistical analyses that may seem baffling when done with pencil and paper.\n", "\n", "At the heart of this \"hacker statistics\" is the ability to draw random numbers. We will focus on **bootstrap** methods in particular.\n", "\n", "To motivate this study, we will work with data measured by Peter and Rosemary Grant on the island of Daphne Major in the Galápagos. They have been going to the island every year for over forty years and have been taking a careful inventory of the finches there. We will look at the finch *Geospiza scandens*. The Grants measured the depths of the beaks (defined as the top-to-bottom thickness of the beak) of all of the finches of this species on the island. We will consider their measurements from 1975 and from 2012. We will investigate whether the beaks got deeper over time.\n", "\n", "The data are from the Grants' book [*40 years of evolution: Darwin's finches on Daphne Major Island*](http://www.worldcat.org/oclc/854285415). They were generous and made their data publicly available on the [Dryad data repository](http://dx.doi.org/10.5061/dryad.g6g3h). In general, it is a very good idea to put your published data in public data repositories, both to preserve the data and also to make your findings public.\n", "\n", "Ok, let's start by loading in the data. You converted the Grants' data into a single `DataFrame` in [exercise 3](../l23_exercise_3_solution). 
Let's load the data, which are available in the file `~/git/data/grant_complete.csv`." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"<thead>\n",
"<tr><th></th><th>band</th><th>beak depth (mm)</th><th>beak length (mm)</th><th>species</th><th>year</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>20123</td><td>8.05</td><td>9.25</td><td>fortis</td><td>1973</td></tr>\n",
"<tr><th>1</th><td>20126</td><td>10.45</td><td>11.35</td><td>fortis</td><td>1973</td></tr>\n",
"<tr><th>2</th><td>20128</td><td>9.55</td><td>10.15</td><td>fortis</td><td>1973</td></tr>\n",
"<tr><th>3</th><td>20129</td><td>8.75</td><td>9.95</td><td>fortis</td><td>1973</td></tr>\n",
"<tr><th>4</th><td>20133</td><td>10.15</td><td>11.55</td><td>fortis</td><td>1973</td></tr>\n",
"</tbody>\n",
"</table>\n", "
" ], "text/plain": [ " band beak depth (mm) beak length (mm) species year\n", "0 20123 8.05 9.25 fortis 1973\n", "1 20126 10.45 11.35 fortis 1973\n", "2 20128 9.55 10.15 fortis 1973\n", "3 20129 8.75 9.95 fortis 1973\n", "4 20133 10.15 11.55 fortis 1973" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv('data/grant_complete.csv', comment='#')\n", "\n", "df.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's trim down the `DataFrame` to only include *G. scandens* from 1975 and 2012 and only include the columns we need." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "df = df.loc[(df['species']=='scandens') & (df['year'].isin([1975, 2012])),\n", " ['year', 'beak depth (mm)']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a look at the ECDFs for these two years." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "application/vnd.vegalite.v2+json": { "$schema": "https://vega.github.io/schema/vega-lite/v2.4.3.json", "config": { "view": { "height": 300, "width": 400 } }, "data": { "values": [ { "ECDF": 0.16091954022988506, "beak depth (mm)": 8.4, "year": 1975 }, { "ECDF": 0.39080459770114945, "beak depth (mm)": 8.8, "year": 1975 }, { "ECDF": 0.1724137931034483, "beak depth (mm)": 8.4, "year": 1975 }, { "ECDF": 0.022988505747126436, "beak depth (mm)": 8, "year": 1975 }, { "ECDF": 0.011494252873563218, "beak depth (mm)": 7.9, "year": 1975 }, { "ECDF": 0.4482758620689655, "beak depth (mm)": 8.9, "year": 1975 }, { "ECDF": 0.26436781609195403, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 0.20689655172413793, "beak depth (mm)": 8.5, "year": 1975 }, { "ECDF": 0.45977011494252873, "beak depth (mm)": 8.9, "year": 1975 }, { "ECDF": 0.6091954022988506, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.27586206896551724, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 
0.9195402298850575, "beak depth (mm)": 9.8, "year": 1975 }, { "ECDF": 0.06896551724137931, "beak depth (mm)": 8.2, "year": 1975 }, { "ECDF": 0.5057471264367817, "beak depth (mm)": 9, "year": 1975 }, { "ECDF": 0.8850574712643678, "beak depth (mm)": 9.7, "year": 1975 }, { "ECDF": 0.28735632183908044, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 0.08045977011494253, "beak depth (mm)": 8.2, "year": 1975 }, { "ECDF": 0.5172413793103449, "beak depth (mm)": 9, "year": 1975 }, { "ECDF": 0.1839080459770115, "beak depth (mm)": 8.4, "year": 1975 }, { "ECDF": 0.2988505747126437, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 0.47126436781609193, "beak depth (mm)": 8.9, "year": 1975 }, { "ECDF": 0.6206896551724138, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.10344827586206896, "beak depth (mm)": 8.3, "year": 1975 }, { "ECDF": 0.3448275862068966, "beak depth (mm)": 8.7, "year": 1975 }, { "ECDF": 0.8505747126436781, "beak depth (mm)": 9.6, "year": 1975 }, { "ECDF": 0.21839080459770116, "beak depth (mm)": 8.5, "year": 1975 }, { "ECDF": 0.632183908045977, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.5287356321839081, "beak depth (mm)": 9, "year": 1975 }, { "ECDF": 0.7126436781609196, "beak depth (mm)": 9.2, "year": 1975 }, { "ECDF": 0.9425287356321839, "beak depth (mm)": 9.9, "year": 1975 }, { "ECDF": 0.3103448275862069, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 0.7241379310344828, "beak depth (mm)": 9.2, "year": 1975 }, { "ECDF": 0.19540229885057472, "beak depth (mm)": 8.4, "year": 1975 }, { "ECDF": 0.4827586206896552, "beak depth (mm)": 8.9, "year": 1975 }, { "ECDF": 0.22988505747126436, "beak depth (mm)": 8.5, "year": 1975 }, { "ECDF": 0.9885057471264368, "beak depth (mm)": 10.4, "year": 1975 }, { "ECDF": 0.8620689655172413, "beak depth (mm)": 9.6, "year": 1975 }, { "ECDF": 0.6436781609195402, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.7586206896551724, "beak depth (mm)": 9.3, "year": 1975 }, { "ECDF": 0.7701149425287356, "beak depth 
(mm)": 9.3, "year": 1975 }, { "ECDF": 0.40229885057471265, "beak depth (mm)": 8.8, "year": 1975 }, { "ECDF": 0.11494252873563218, "beak depth (mm)": 8.3, "year": 1975 }, { "ECDF": 0.41379310344827586, "beak depth (mm)": 8.8, "year": 1975 }, { "ECDF": 0.6551724137931034, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.9540229885057471, "beak depth (mm)": 10.1, "year": 1975 }, { "ECDF": 0.4942528735632184, "beak depth (mm)": 8.9, "year": 1975 }, { "ECDF": 0.735632183908046, "beak depth (mm)": 9.2, "year": 1975 }, { "ECDF": 0.2413793103448276, "beak depth (mm)": 8.5, "year": 1975 }, { "ECDF": 0.9770114942528736, "beak depth (mm)": 10.2, "year": 1975 }, { "ECDF": 0.9655172413793104, "beak depth (mm)": 10.1, "year": 1975 }, { "ECDF": 0.7471264367816092, "beak depth (mm)": 9.2, "year": 1975 }, { "ECDF": 0.896551724137931, "beak depth (mm)": 9.7, "year": 1975 }, { "ECDF": 0.6666666666666666, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.25287356321839083, "beak depth (mm)": 8.5, "year": 1975 }, { "ECDF": 0.09195402298850575, "beak depth (mm)": 8.2, "year": 1975 }, { "ECDF": 0.5402298850574713, "beak depth (mm)": 9, "year": 1975 }, { "ECDF": 0.7816091954022989, "beak depth (mm)": 9.3, "year": 1975 }, { "ECDF": 0.034482758620689655, "beak depth (mm)": 8, "year": 1975 }, { "ECDF": 0.6781609195402298, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.05747126436781609, "beak depth (mm)": 8.1, "year": 1975 }, { "ECDF": 0.12643678160919541, "beak depth (mm)": 8.3, "year": 1975 }, { "ECDF": 0.3563218390804598, "beak depth (mm)": 8.7, "year": 1975 }, { "ECDF": 0.42528735632183906, "beak depth (mm)": 8.8, "year": 1975 }, { "ECDF": 0.3218390804597701, "beak depth (mm)": 8.6, "year": 1975 }, { "ECDF": 0.367816091954023, "beak depth (mm)": 8.7, "year": 1975 }, { "ECDF": 0.04597701149425287, "beak depth (mm)": 8, "year": 1975 }, { "ECDF": 0.4367816091954023, "beak depth (mm)": 8.8, "year": 1975 }, { "ECDF": 0.5517241379310345, "beak depth (mm)": 9, "year": 1975 }, { 
"ECDF": 0.6896551724137931, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.9080459770114943, "beak depth (mm)": 9.74, "year": 1975 }, { "ECDF": 0.7011494252873564, "beak depth (mm)": 9.1, "year": 1975 }, { "ECDF": 0.9310344827586207, "beak depth (mm)": 9.8, "year": 1975 }, { "ECDF": 1, "beak depth (mm)": 10.4, "year": 1975 }, { "ECDF": 0.13793103448275862, "beak depth (mm)": 8.3, "year": 1975 }, { "ECDF": 0.7931034482758621, "beak depth (mm)": 9.44, "year": 1975 }, { "ECDF": 0.5747126436781609, "beak depth (mm)": 9.04, "year": 1975 }, { "ECDF": 0.5632183908045977, "beak depth (mm)": 9, "year": 1975 }, { "ECDF": 0.5862068965517241, "beak depth (mm)": 9.05, "year": 1975 }, { "ECDF": 0.8735632183908046, "beak depth (mm)": 9.65, "year": 1975 }, { "ECDF": 0.8045977011494253, "beak depth (mm)": 9.45, "year": 1975 }, { "ECDF": 0.3333333333333333, "beak depth (mm)": 8.65, "year": 1975 }, { "ECDF": 0.8160919540229885, "beak depth (mm)": 9.45, "year": 1975 }, { "ECDF": 0.8275862068965517, "beak depth (mm)": 9.45, "year": 1975 }, { "ECDF": 0.5977011494252874, "beak depth (mm)": 9.05, "year": 1975 }, { "ECDF": 0.3793103448275862, "beak depth (mm)": 8.75, "year": 1975 }, { "ECDF": 0.8390804597701149, "beak depth (mm)": 9.45, "year": 1975 }, { "ECDF": 0.14942528735632185, "beak depth (mm)": 8.35, "year": 1975 }, { "ECDF": 0.5952380952380952, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.6587301587301587, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 1, "beak depth (mm)": 11, "year": 2012 }, { "ECDF": 0.20634920634920634, "beak depth (mm)": 8.7, "year": 2012 }, { "ECDF": 0.1111111111111111, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.4365079365079365, "beak depth (mm)": 9.1, "year": 2012 }, { "ECDF": 0.21428571428571427, "beak depth (mm)": 8.7, "year": 2012 }, { "ECDF": 0.9206349206349206, "beak depth (mm)": 10.2, "year": 2012 }, { "ECDF": 0.746031746031746, "beak depth (mm)": 9.6, "year": 2012 }, { "ECDF": 0.2857142857142857, "beak depth (mm)": 8.85, 
"year": 2012 }, { "ECDF": 0.24603174603174602, "beak depth (mm)": 8.8, "year": 2012 }, { "ECDF": 0.6666666666666666, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.49206349206349204, "beak depth (mm)": 9.2, "year": 2012 }, { "ECDF": 0.38095238095238093, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.8015873015873016, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.5238095238095238, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.3888888888888889, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.9285714285714286, "beak depth (mm)": 10.2, "year": 2012 }, { "ECDF": 0.007936507936507936, "beak depth (mm)": 7.7, "year": 2012 }, { "ECDF": 0.3968253968253968, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.6746031746031746, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.6031746031746031, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.03968253968253968, "beak depth (mm)": 8, "year": 2012 }, { "ECDF": 0.29365079365079366, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.6111111111111112, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.6825396825396826, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.047619047619047616, "beak depth (mm)": 8, "year": 2012 }, { "ECDF": 0.8809523809523809, "beak depth (mm)": 10, "year": 2012 }, { "ECDF": 0.373015873015873, "beak depth (mm)": 8.95, "year": 2012 }, { "ECDF": 0.07936507936507936, "beak depth (mm)": 8.2, "year": 2012 }, { "ECDF": 0.25396825396825395, "beak depth (mm)": 8.8, "year": 2012 }, { "ECDF": 0.5, "beak depth (mm)": 9.2, "year": 2012 }, { "ECDF": 0.6190476190476191, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.6904761904761905, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.06349206349206349, "beak depth (mm)": 8.1, "year": 2012 }, { "ECDF": 0.6984126984126984, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.11904761904761904, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.5317460317460317, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.5396825396825397, "beak 
depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.753968253968254, "beak depth (mm)": 9.6, "year": 2012 }, { "ECDF": 0.5079365079365079, "beak depth (mm)": 9.2, "year": 2012 }, { "ECDF": 0.8888888888888888, "beak depth (mm)": 10, "year": 2012 }, { "ECDF": 0.30158730158730157, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.9603174603174603, "beak depth (mm)": 10.5, "year": 2012 }, { "ECDF": 0.30952380952380953, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.19047619047619047, "beak depth (mm)": 8.6, "year": 2012 }, { "ECDF": 0.2619047619047619, "beak depth (mm)": 8.8, "year": 2012 }, { "ECDF": 0.48412698412698413, "beak depth (mm)": 9.15, "year": 2012 }, { "ECDF": 0.7063492063492064, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.4444444444444444, "beak depth (mm)": 9.1, "year": 2012 }, { "ECDF": 0.9365079365079365, "beak depth (mm)": 10.2, "year": 2012 }, { "ECDF": 0.12698412698412698, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.8968253968253969, "beak depth (mm)": 10, "year": 2012 }, { "ECDF": 0.9444444444444444, "beak depth (mm)": 10.2, "year": 2012 }, { "ECDF": 0.5476190476190477, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.9920634920634921, "beak depth (mm)": 10.8, "year": 2012 }, { "ECDF": 0.09523809523809523, "beak depth (mm)": 8.3, "year": 2012 }, { "ECDF": 0.023809523809523808, "beak depth (mm)": 7.8, "year": 2012 }, { "ECDF": 0.8095238095238095, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.031746031746031744, "beak depth (mm)": 7.9, "year": 2012 }, { "ECDF": 0.31746031746031744, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.015873015873015872, "beak depth (mm)": 7.7, "year": 2012 }, { "ECDF": 0.3253968253968254, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.626984126984127, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.6349206349206349, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.15873015873015872, "beak depth (mm)": 8.5, "year": 2012 }, { "ECDF": 0.16666666666666666, "beak depth (mm)": 8.5, 
"year": 2012 }, { "ECDF": 0.7619047619047619, "beak depth (mm)": 9.6, "year": 2012 }, { "ECDF": 0.9523809523809523, "beak depth (mm)": 10.2, "year": 2012 }, { "ECDF": 0.2698412698412698, "beak depth (mm)": 8.8, "year": 2012 }, { "ECDF": 0.7142857142857143, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.5555555555555556, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.40476190476190477, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.5158730158730159, "beak depth (mm)": 9.2, "year": 2012 }, { "ECDF": 0.2222222222222222, "beak depth (mm)": 8.7, "year": 2012 }, { "ECDF": 0.4126984126984127, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.4523809523809524, "beak depth (mm)": 9.1, "year": 2012 }, { "ECDF": 0.23015873015873015, "beak depth (mm)": 8.7, "year": 2012 }, { "ECDF": 0.6428571428571429, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.8174603174603174, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.1984126984126984, "beak depth (mm)": 8.6, "year": 2012 }, { "ECDF": 0.9761904761904762, "beak depth (mm)": 10.6, "year": 2012 }, { "ECDF": 0.42063492063492064, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.7222222222222222, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.07142857142857142, "beak depth (mm)": 8.1, "year": 2012 }, { "ECDF": 0.5634920634920635, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.7698412698412699, "beak depth (mm)": 9.6, "year": 2012 }, { "ECDF": 0.1746031746031746, "beak depth (mm)": 8.5, "year": 2012 }, { "ECDF": 0.0873015873015873, "beak depth (mm)": 8.2, "year": 2012 }, { "ECDF": 0.05555555555555555, "beak depth (mm)": 8, "year": 2012 }, { "ECDF": 0.7301587301587301, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 0.7857142857142857, "beak depth (mm)": 9.7, "year": 2012 }, { "ECDF": 0.8650793650793651, "beak depth (mm)": 9.9, "year": 2012 }, { "ECDF": 0.4603174603174603, "beak depth (mm)": 9.1, "year": 2012 }, { "ECDF": 0.7380952380952381, "beak depth (mm)": 9.5, "year": 2012 }, { "ECDF": 
0.8253968253968254, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.1349206349206349, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.10317460317460317, "beak depth (mm)": 8.3, "year": 2012 }, { "ECDF": 0.7777777777777778, "beak depth (mm)": 9.6, "year": 2012 }, { "ECDF": 0.6507936507936508, "beak depth (mm)": 9.4, "year": 2012 }, { "ECDF": 0.9047619047619048, "beak depth (mm)": 10, "year": 2012 }, { "ECDF": 0.3333333333333333, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.46825396825396826, "beak depth (mm)": 9.1, "year": 2012 }, { "ECDF": 0.8333333333333334, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.5714285714285714, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.873015873015873, "beak depth (mm)": 9.9, "year": 2012 }, { "ECDF": 0.3412698412698413, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.18253968253968253, "beak depth (mm)": 8.5, "year": 2012 }, { "ECDF": 0.9841269841269841, "beak depth (mm)": 10.6, "year": 2012 }, { "ECDF": 0.5793650793650794, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.3492063492063492, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.35714285714285715, "beak depth (mm)": 8.9, "year": 2012 }, { "ECDF": 0.7936507936507936, "beak depth (mm)": 9.7, "year": 2012 }, { "ECDF": 0.8412698412698413, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.9682539682539683, "beak depth (mm)": 10.5, "year": 2012 }, { "ECDF": 0.14285714285714285, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.9126984126984127, "beak depth (mm)": 10, "year": 2012 }, { "ECDF": 0.42857142857142855, "beak depth (mm)": 9, "year": 2012 }, { "ECDF": 0.23809523809523808, "beak depth (mm)": 8.7, "year": 2012 }, { "ECDF": 0.2777777777777778, "beak depth (mm)": 8.8, "year": 2012 }, { "ECDF": 0.15079365079365079, "beak depth (mm)": 8.4, "year": 2012 }, { "ECDF": 0.5873015873015873, "beak depth (mm)": 9.3, "year": 2012 }, { "ECDF": 0.8492063492063492, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.36507936507936506, "beak depth 
(mm)": 8.9, "year": 2012 }, { "ECDF": 0.8571428571428571, "beak depth (mm)": 9.8, "year": 2012 }, { "ECDF": 0.47619047619047616, "beak depth (mm)": 9.1, "year": 2012 } ] }, "encoding": { "color": { "field": "year", "type": "nominal" }, "x": { "field": "beak depth (mm)", "scale": { "zero": false }, "type": "quantitative" }, "y": { "field": "ECDF", "type": "quantitative" } }, "mark": "point" }, "image/png": "", "text/plain": [ "\n", "\n", "If you see this message, it means the renderer has not been properly enabled\n", "for the frontend that you are using. For more information, see\n", "https://altair-viz.github.io/user_guide/troubleshooting.html\n" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Compute ECDF\n", "df['ECDF'] = df.groupby('year').transform(bootcamp_utils.ecdf_y)\n", "\n", "# Make a plot\n", "alt.Chart(df\n", " ).mark_point(\n", " ).encode(\n", " x=alt.X('beak depth (mm):Q', \n", " scale=alt.Scale(zero=False)),\n", " y='ECDF:Q',\n", " color='year:N')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Judging from the ECDFs, it seems as though beaks have gotten deeper over time. But now, we would like a *statistic* to compare. One statistic that comes to mind is the mean. So, let's compare the means. First, we'll pull out the data sets as NumPy arrays for convenience (and speed later on when we start doing bootstrap replicates)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(8.959999999999999, 9.188492063492063)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bd_1975 = df.loc[df['year']==1975, 'beak depth (mm)'].values\n", "bd_2012 = df.loc[df['year']==2012, 'beak depth (mm)'].values\n", "\n", "np.mean(bd_1975), np.mean(bd_2012)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, indeed, the mean beak depth is bigger in 2012 than in 1975. 
There is clearly some variability in beak depth among the birds measured each year, so it is possible that this observation is just due to random chance and the mean beak depth did not really change. So, we would like to compute a *confidence interval* of the mean. We will compute the 95% confidence interval.\n", "\n", "What is a 95% confidence interval? It can be thought of as follows. If we were to repeat the experiment over and over and over again, 95% of the time, the observed mean would lie in the 95% confidence interval. So, if the confidence intervals of the means of measurements from 1975 and from 2012 overlapped, we might not be so sure that the beaks got deeper due to some underlying selective pressure, and might instead suspect that we just happened to *observe* deeper beaks as a result of natural variability.\n", "\n", "So, how do we compute a confidence interval? We use our computer!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bootstrap confidence intervals\n", "\n", "The notion of the bootstrap was first published by Brad Efron in 1979. The idea is simple, and we will take the fact that it works as a given; Efron proved it for us. \n", "\n", "Here's the idea: If we could somehow repeat the measurements of the beak depths on Daphne Major, we could do it many, many times, computing the mean each time, and we could then just compute the 2.5th and 97.5th percentiles of those means to get a 95% confidence interval. The problem is, we can't repeat the experiments over and over again. 1975 only happened once, and all birds on the island were measured. We cannot have 1975 happen again under exactly the same conditions. \n", "\n", "Instead, we will have our computer *simulate* doing the experiment over and over again. Hacker statistics! We have one set of measurements. We \"repeat\" the experiment by drawing measurements out of the ones we have again and again. Here's what we do to compute a bootstrap estimate of the mean of a set of $n$ data points.\n", ">1. 
Draw *n* data points out of the original data set *with replacement*. This set of data points is called a **bootstrap sample**.\n", "2. Compute the mean of the bootstrap sample. This is called a **bootstrap replicate** of the mean.\n", "3. Do this over and over again, storing the results.\n", "\n", "So, it is as if we did the experiment over and over again, obtaining a mean each time. Remember, our bootstrap sample has exactly the same number of \"measurements\" as the original data set. Let's try it with the `bd_1975` data (remember the mean beak depth was 8.96 mm). First we'll generate a bootstrap sample. Remember, the function `np.random.choice()` allows us to sample out of an array with replacement, if we like." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "np.random.seed(42)\n", "bs_sample = np.random.choice(bd_1975, replace=True, size=len(bd_1975))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's take a quick look at this bootstrap sample by plotting its ECDF right next to that of the original data set." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "application/vnd.vegalite.v2+json": { "$schema": "https://vega.github.io/schema/vega-lite/v2.4.3.json", "config": { "view": { "height": 300, "width": 400 } }, "data": { "values": [ { "ECDF": 0.011494252873563218, "beak depth (mm)": 8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.022988505747126436, "beak depth (mm)": 8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.034482758620689655, "beak depth (mm)": 8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.04597701149425287, "beak depth (mm)": 8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.05747126436781609, "beak depth (mm)": 8.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.06896551724137931, "beak depth (mm)": 8.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.08045977011494253, "beak depth (mm)": 8.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.09195402298850575, "beak depth (mm)": 8.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.10344827586206896, "beak depth (mm)": 8.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.11494252873563218, "beak depth (mm)": 8.3, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.12643678160919541, "beak depth (mm)": 8.3, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.13793103448275862, "beak depth (mm)": 8.3, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.14942528735632185, "beak depth (mm)": 8.35, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.16091954022988506, "beak depth (mm)": 8.35, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.1724137931034483, "beak depth (mm)": 8.4, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.1839080459770115, "beak depth (mm)": 8.4, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.19540229885057472, "beak depth (mm)": 8.4, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.20689655172413793, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.21839080459770116, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { 
"ECDF": 0.22988505747126436, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.2413793103448276, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.25287356321839083, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.26436781609195403, "beak depth (mm)": 8.5, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.27586206896551724, "beak depth (mm)": 8.6, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.28735632183908044, "beak depth (mm)": 8.6, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.2988505747126437, "beak depth (mm)": 8.6, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3103448275862069, "beak depth (mm)": 8.65, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3218390804597701, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3333333333333333, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3448275862068966, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3563218390804598, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.367816091954023, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.3793103448275862, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.39080459770114945, "beak depth (mm)": 8.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.40229885057471265, "beak depth (mm)": 8.75, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.41379310344827586, "beak depth (mm)": 8.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.42528735632183906, "beak depth (mm)": 8.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.4367816091954023, "beak depth (mm)": 8.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.4482758620689655, "beak depth (mm)": 8.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.45977011494252873, "beak depth (mm)": 8.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.47126436781609193, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 
0.4827586206896552, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.4942528735632184, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5057471264367817, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5172413793103449, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5287356321839081, "beak depth (mm)": 8.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5402298850574713, "beak depth (mm)": 9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5517241379310345, "beak depth (mm)": 9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5632183908045977, "beak depth (mm)": 9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5747126436781609, "beak depth (mm)": 9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5862068965517241, "beak depth (mm)": 9.04, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.5977011494252874, "beak depth (mm)": 9.05, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6091954022988506, "beak depth (mm)": 9.05, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6206896551724138, "beak depth (mm)": 9.05, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.632183908045977, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6436781609195402, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6551724137931034, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6666666666666666, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6781609195402298, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.6896551724137931, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7011494252873564, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7126436781609196, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7241379310344828, "beak depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.735632183908046, "beak 
depth (mm)": 9.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7471264367816092, "beak depth (mm)": 9.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7586206896551724, "beak depth (mm)": 9.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7701149425287356, "beak depth (mm)": 9.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7816091954022989, "beak depth (mm)": 9.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.7931034482758621, "beak depth (mm)": 9.3, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8045977011494253, "beak depth (mm)": 9.3, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8160919540229885, "beak depth (mm)": 9.44, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8275862068965517, "beak depth (mm)": 9.44, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8390804597701149, "beak depth (mm)": 9.45, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8505747126436781, "beak depth (mm)": 9.45, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8620689655172413, "beak depth (mm)": 9.45, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8735632183908046, "beak depth (mm)": 9.45, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.8850574712643678, "beak depth (mm)": 9.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.896551724137931, "beak depth (mm)": 9.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9080459770114943, "beak depth (mm)": 9.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9195402298850575, "beak depth (mm)": 9.7, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9310344827586207, "beak depth (mm)": 9.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9425287356321839, "beak depth (mm)": 9.8, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9540229885057471, "beak depth (mm)": 9.9, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9655172413793104, "beak depth (mm)": 10.1, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9770114942528736, "beak depth (mm)": 10.2, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.9885057471264368, "beak depth (mm)": 
10.4, "set": "bootstrap", "year": 1975 }, { "ECDF": 1, "beak depth (mm)": 10.4, "set": "bootstrap", "year": 1975 }, { "ECDF": 0.16091954022988506, "beak depth (mm)": 8.4, "set": 1975, "year": 1975 }, { "ECDF": 0.39080459770114945, "beak depth (mm)": 8.8, "set": 1975, "year": 1975 }, { "ECDF": 0.1724137931034483, "beak depth (mm)": 8.4, "set": 1975, "year": 1975 }, { "ECDF": 0.022988505747126436, "beak depth (mm)": 8, "set": 1975, "year": 1975 }, { "ECDF": 0.011494252873563218, "beak depth (mm)": 7.9, "set": 1975, "year": 1975 }, { "ECDF": 0.4482758620689655, "beak depth (mm)": 8.9, "set": 1975, "year": 1975 }, { "ECDF": 0.26436781609195403, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.20689655172413793, "beak depth (mm)": 8.5, "set": 1975, "year": 1975 }, { "ECDF": 0.45977011494252873, "beak depth (mm)": 8.9, "set": 1975, "year": 1975 }, { "ECDF": 0.6091954022988506, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.27586206896551724, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.9195402298850575, "beak depth (mm)": 9.8, "set": 1975, "year": 1975 }, { "ECDF": 0.06896551724137931, "beak depth (mm)": 8.2, "set": 1975, "year": 1975 }, { "ECDF": 0.5057471264367817, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.8850574712643678, "beak depth (mm)": 9.7, "set": 1975, "year": 1975 }, { "ECDF": 0.28735632183908044, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.08045977011494253, "beak depth (mm)": 8.2, "set": 1975, "year": 1975 }, { "ECDF": 0.5172413793103449, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.1839080459770115, "beak depth (mm)": 8.4, "set": 1975, "year": 1975 }, { "ECDF": 0.2988505747126437, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.47126436781609193, "beak depth (mm)": 8.9, "set": 1975, "year": 1975 }, { "ECDF": 0.6206896551724138, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.10344827586206896, "beak depth (mm)": 
8.3, "set": 1975, "year": 1975 }, { "ECDF": 0.3448275862068966, "beak depth (mm)": 8.7, "set": 1975, "year": 1975 }, { "ECDF": 0.8505747126436781, "beak depth (mm)": 9.6, "set": 1975, "year": 1975 }, { "ECDF": 0.21839080459770116, "beak depth (mm)": 8.5, "set": 1975, "year": 1975 }, { "ECDF": 0.632183908045977, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.5287356321839081, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.7126436781609196, "beak depth (mm)": 9.2, "set": 1975, "year": 1975 }, { "ECDF": 0.9425287356321839, "beak depth (mm)": 9.9, "set": 1975, "year": 1975 }, { "ECDF": 0.3103448275862069, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.7241379310344828, "beak depth (mm)": 9.2, "set": 1975, "year": 1975 }, { "ECDF": 0.19540229885057472, "beak depth (mm)": 8.4, "set": 1975, "year": 1975 }, { "ECDF": 0.4827586206896552, "beak depth (mm)": 8.9, "set": 1975, "year": 1975 }, { "ECDF": 0.22988505747126436, "beak depth (mm)": 8.5, "set": 1975, "year": 1975 }, { "ECDF": 0.9885057471264368, "beak depth (mm)": 10.4, "set": 1975, "year": 1975 }, { "ECDF": 0.8620689655172413, "beak depth (mm)": 9.6, "set": 1975, "year": 1975 }, { "ECDF": 0.6436781609195402, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.7586206896551724, "beak depth (mm)": 9.3, "set": 1975, "year": 1975 }, { "ECDF": 0.7701149425287356, "beak depth (mm)": 9.3, "set": 1975, "year": 1975 }, { "ECDF": 0.40229885057471265, "beak depth (mm)": 8.8, "set": 1975, "year": 1975 }, { "ECDF": 0.11494252873563218, "beak depth (mm)": 8.3, "set": 1975, "year": 1975 }, { "ECDF": 0.41379310344827586, "beak depth (mm)": 8.8, "set": 1975, "year": 1975 }, { "ECDF": 0.6551724137931034, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.9540229885057471, "beak depth (mm)": 10.1, "set": 1975, "year": 1975 }, { "ECDF": 0.4942528735632184, "beak depth (mm)": 8.9, "set": 1975, "year": 1975 }, { "ECDF": 0.735632183908046, "beak depth (mm)": 
9.2, "set": 1975, "year": 1975 }, { "ECDF": 0.2413793103448276, "beak depth (mm)": 8.5, "set": 1975, "year": 1975 }, { "ECDF": 0.9770114942528736, "beak depth (mm)": 10.2, "set": 1975, "year": 1975 }, { "ECDF": 0.9655172413793104, "beak depth (mm)": 10.1, "set": 1975, "year": 1975 }, { "ECDF": 0.7471264367816092, "beak depth (mm)": 9.2, "set": 1975, "year": 1975 }, { "ECDF": 0.896551724137931, "beak depth (mm)": 9.7, "set": 1975, "year": 1975 }, { "ECDF": 0.6666666666666666, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.25287356321839083, "beak depth (mm)": 8.5, "set": 1975, "year": 1975 }, { "ECDF": 0.09195402298850575, "beak depth (mm)": 8.2, "set": 1975, "year": 1975 }, { "ECDF": 0.5402298850574713, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.7816091954022989, "beak depth (mm)": 9.3, "set": 1975, "year": 1975 }, { "ECDF": 0.034482758620689655, "beak depth (mm)": 8, "set": 1975, "year": 1975 }, { "ECDF": 0.6781609195402298, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.05747126436781609, "beak depth (mm)": 8.1, "set": 1975, "year": 1975 }, { "ECDF": 0.12643678160919541, "beak depth (mm)": 8.3, "set": 1975, "year": 1975 }, { "ECDF": 0.3563218390804598, "beak depth (mm)": 8.7, "set": 1975, "year": 1975 }, { "ECDF": 0.42528735632183906, "beak depth (mm)": 8.8, "set": 1975, "year": 1975 }, { "ECDF": 0.3218390804597701, "beak depth (mm)": 8.6, "set": 1975, "year": 1975 }, { "ECDF": 0.367816091954023, "beak depth (mm)": 8.7, "set": 1975, "year": 1975 }, { "ECDF": 0.04597701149425287, "beak depth (mm)": 8, "set": 1975, "year": 1975 }, { "ECDF": 0.4367816091954023, "beak depth (mm)": 8.8, "set": 1975, "year": 1975 }, { "ECDF": 0.5517241379310345, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.6896551724137931, "beak depth (mm)": 9.1, "set": 1975, "year": 1975 }, { "ECDF": 0.9080459770114943, "beak depth (mm)": 9.74, "set": 1975, "year": 1975 }, { "ECDF": 0.7011494252873564, "beak depth (mm)": 9.1, 
"set": 1975, "year": 1975 }, { "ECDF": 0.9310344827586207, "beak depth (mm)": 9.8, "set": 1975, "year": 1975 }, { "ECDF": 1, "beak depth (mm)": 10.4, "set": 1975, "year": 1975 }, { "ECDF": 0.13793103448275862, "beak depth (mm)": 8.3, "set": 1975, "year": 1975 }, { "ECDF": 0.7931034482758621, "beak depth (mm)": 9.44, "set": 1975, "year": 1975 }, { "ECDF": 0.5747126436781609, "beak depth (mm)": 9.04, "set": 1975, "year": 1975 }, { "ECDF": 0.5632183908045977, "beak depth (mm)": 9, "set": 1975, "year": 1975 }, { "ECDF": 0.5862068965517241, "beak depth (mm)": 9.05, "set": 1975, "year": 1975 }, { "ECDF": 0.8735632183908046, "beak depth (mm)": 9.65, "set": 1975, "year": 1975 }, { "ECDF": 0.8045977011494253, "beak depth (mm)": 9.45, "set": 1975, "year": 1975 }, { "ECDF": 0.3333333333333333, "beak depth (mm)": 8.65, "set": 1975, "year": 1975 }, { "ECDF": 0.8160919540229885, "beak depth (mm)": 9.45, "set": 1975, "year": 1975 }, { "ECDF": 0.8275862068965517, "beak depth (mm)": 9.45, "set": 1975, "year": 1975 }, { "ECDF": 0.5977011494252874, "beak depth (mm)": 9.05, "set": 1975, "year": 1975 }, { "ECDF": 0.3793103448275862, "beak depth (mm)": 8.75, "set": 1975, "year": 1975 }, { "ECDF": 0.8390804597701149, "beak depth (mm)": 9.45, "set": 1975, "year": 1975 }, { "ECDF": 0.14942528735632185, "beak depth (mm)": 8.35, "set": 1975, "year": 1975 } ] }, "encoding": { "color": { "field": "set", "type": "nominal" }, "x": { "field": "beak depth (mm)", "scale": { "zero": false }, "type": "quantitative" }, "y": { "field": "ECDF", "type": "quantitative" } }, "mark": "point" }, "image/png": "", "text/plain": [ "\n", "\n", "If you see this message, it means the renderer has not been properly enabled\n", "for the frontend that you are using. 
For more information, see\n", "https://altair-viz.github.io/user_guide/troubleshooting.html\n" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Compute ECDF of bootstrap sample\n", "x_bs, y_bs = bootcamp_utils.ecdf_vals(bs_sample)\n", "df_bs = pd.DataFrame(data={'beak depth (mm)': x_bs,\n", "                           'ECDF': y_bs,\n", "                           'set': 'bootstrap',\n", "                           'year': 1975})\n", "\n", "# DataFrame for original data\n", "df_original = df.copy().loc[df['year']==1975, :]\n", "df_original['set'] = 1975\n", "\n", "# DataFrame for plotting\n", "df_plot = pd.concat([df_bs, df_original], ignore_index=True, sort=True)\n", "\n", "# Plot the ECDFs\n", "alt.Chart(df_plot\n", "    ).mark_point(\n", "    ).encode(x=alt.X('beak depth (mm):Q', scale=alt.Scale(zero=False)),\n", "             y='ECDF:Q',\n", "             color=alt.Color('set:N'))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is qualitatively similar, but obviously not exactly the same data set.\n", "\n", "Now, let's compute our bootstrap replicate. It's as simple as computing the mean of the bootstrap sample." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8.92609195402299" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bs_replicate = np.mean(bs_sample)\n", "bs_replicate" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, the bootstrap replicate of the mean is 8.93 mm, which is less than the mean of 8.96 mm from the original data set.\n", "\n", "Now, we can write a **`for`** loop to get lots and lots of bootstrap replicates. Note that since you are computing the replicates many, many times, speed matters. For this reason, be sure you convert the data you are bootstrapping into a Numpy array. Calculations with Numpy arrays are **much** faster than with Pandas `Series`."
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Number of replicates\n", "n_reps = 100000\n", "\n", "# Initialize bootstrap replicates array\n", "bs_reps_1975 = np.empty(n_reps)\n", "\n", "# Compute replicates\n", "for i in range(n_reps):\n", "    bs_sample = np.random.choice(bd_1975, size=len(bd_1975))\n", "    bs_reps_1975[i] = np.mean(bs_sample)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have our replicates, 100,000 of them, we can plot an ECDF to see what we might expect of the mean if we were to do the experiment again." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "application/vnd.vegalite.v2+json": { "$schema": "https://vega.github.io/schema/vega-lite/v2.4.3.json", "config": { "view": { "height": 300, "width": 400 } }, "data": { "values": [ { "ECDF": 1e-05, "mean beak depth (mm)": 8.711149425287356 }, { "ECDF": 0.01001, "mean beak depth (mm)": 8.82183908045977 }, { "ECDF": 0.02001, "mean beak depth (mm)": 8.837356321839081 }, { "ECDF": 0.03001, "mean beak depth (mm)": 8.847586206896551 }, { "ECDF": 0.04001, "mean beak depth (mm)": 8.855057471264368 }, { "ECDF": 0.05001, "mean beak depth (mm)": 8.861379310344828 }, { "ECDF": 0.06001, "mean beak depth (mm)": 8.866666666666667 }, { "ECDF": 0.07001, "mean beak depth (mm)": 8.871149425287356 }, { "ECDF": 0.08001, "mean beak depth (mm)": 8.875402298850576 }, { "ECDF": 0.09001, "mean beak depth (mm)": 8.879195402298851 }, { "ECDF": 0.10001, "mean beak depth (mm)": 8.882758620689655 }, { "ECDF": 0.11001, "mean beak depth (mm)": 8.885977011494251 }, { "ECDF": 0.12001, "mean beak depth (mm)": 8.888965517241378 }, { "ECDF": 0.13001, "mean beak depth (mm)": 8.891724137931035 }, { "ECDF": 0.14001, "mean beak depth (mm)": 8.894597701149424 }, { "ECDF": 0.15001, "mean beak depth (mm)": 8.89712643678161 }, { "ECDF": 0.16001, "mean beak depth (mm)": 8.899655172413794 }, { "ECDF": 0.17001, 
"mean beak depth (mm)": 8.902068965517241 }, { "ECDF": 0.18001, "mean beak depth (mm)": 8.904367816091955 }, { "ECDF": 0.19001, "mean beak depth (mm)": 8.906551724137932 }, { "ECDF": 0.20001, "mean beak depth (mm)": 8.908850574712643 }, { "ECDF": 0.21001, "mean beak depth (mm)": 8.910919540229886 }, { "ECDF": 0.22001, "mean beak depth (mm)": 8.913103448275862 }, { "ECDF": 0.23001, "mean beak depth (mm)": 8.915057471264367 }, { "ECDF": 0.24001, "mean beak depth (mm)": 8.917011494252874 }, { "ECDF": 0.25001, "mean beak depth (mm)": 8.919080459770115 }, { "ECDF": 0.26001, "mean beak depth (mm)": 8.920919540229885 }, { "ECDF": 0.27001, "mean beak depth (mm)": 8.922758620689654 }, { "ECDF": 0.28001, "mean beak depth (mm)": 8.924482758620691 }, { "ECDF": 0.29001, "mean beak depth (mm)": 8.926206896551724 }, { "ECDF": 0.30001, "mean beak depth (mm)": 8.927931034482757 }, { "ECDF": 0.31001, "mean beak depth (mm)": 8.929655172413794 }, { "ECDF": 0.32001, "mean beak depth (mm)": 8.931264367816091 }, { "ECDF": 0.33001, "mean beak depth (mm)": 8.932988505747124 }, { "ECDF": 0.34001, "mean beak depth (mm)": 8.93471264367816 }, { "ECDF": 0.35001, "mean beak depth (mm)": 8.93632183908046 }, { "ECDF": 0.36001, "mean beak depth (mm)": 8.937931034482759 }, { "ECDF": 0.37001, "mean beak depth (mm)": 8.939540229885056 }, { "ECDF": 0.38001, "mean beak depth (mm)": 8.94103448275862 }, { "ECDF": 0.39001, "mean beak depth (mm)": 8.942528735632182 }, { "ECDF": 0.40001, "mean beak depth (mm)": 8.944137931034483 }, { "ECDF": 0.41001, "mean beak depth (mm)": 8.945747126436782 }, { "ECDF": 0.42001, "mean beak depth (mm)": 8.947356321839079 }, { "ECDF": 0.43001, "mean beak depth (mm)": 8.94873563218391 }, { "ECDF": 0.44001, "mean beak depth (mm)": 8.950344827586207 }, { "ECDF": 0.45001, "mean beak depth (mm)": 8.951724137931036 }, { "ECDF": 0.46001, "mean beak depth (mm)": 8.953218390804599 }, { "ECDF": 0.47001, "mean beak depth (mm)": 8.954827586206898 }, { "ECDF": 0.48001, "mean beak depth 
(mm)": 8.956321839080461 }, { "ECDF": 0.49001, "mean beak depth (mm)": 8.957816091954024 }, { "ECDF": 0.50001, "mean beak depth (mm)": 8.959425287356321 }, { "ECDF": 0.51001, "mean beak depth (mm)": 8.960919540229884 }, { "ECDF": 0.52001, "mean beak depth (mm)": 8.962528735632183 }, { "ECDF": 0.53001, "mean beak depth (mm)": 8.964022988505747 }, { "ECDF": 0.54001, "mean beak depth (mm)": 8.965517241379311 }, { "ECDF": 0.55001, "mean beak depth (mm)": 8.967126436781609 }, { "ECDF": 0.56001, "mean beak depth (mm)": 8.968620689655175 }, { "ECDF": 0.57001, "mean beak depth (mm)": 8.970229885057472 }, { "ECDF": 0.58001, "mean beak depth (mm)": 8.97183908045977 }, { "ECDF": 0.59001, "mean beak depth (mm)": 8.973448275862069 }, { "ECDF": 0.60001, "mean beak depth (mm)": 8.974942528735633 }, { "ECDF": 0.61001, "mean beak depth (mm)": 8.97655172413793 }, { "ECDF": 0.62001, "mean beak depth (mm)": 8.97816091954023 }, { "ECDF": 0.63001, "mean beak depth (mm)": 8.979770114942527 }, { "ECDF": 0.64001, "mean beak depth (mm)": 8.981379310344826 }, { "ECDF": 0.65001, "mean beak depth (mm)": 8.982988505747127 }, { "ECDF": 0.66001, "mean beak depth (mm)": 8.98448275862069 }, { "ECDF": 0.67001, "mean beak depth (mm)": 8.986206896551723 }, { "ECDF": 0.68001, "mean beak depth (mm)": 8.987816091954025 }, { "ECDF": 0.69001, "mean beak depth (mm)": 8.989540229885058 }, { "ECDF": 0.70001, "mean beak depth (mm)": 8.991264367816092 }, { "ECDF": 0.71001, "mean beak depth (mm)": 8.992988505747126 }, { "ECDF": 0.72001, "mean beak depth (mm)": 8.994827586206895 }, { "ECDF": 0.73001, "mean beak depth (mm)": 8.996666666666666 }, { "ECDF": 0.74001, "mean beak depth (mm)": 8.998505747126437 }, { "ECDF": 0.75001, "mean beak depth (mm)": 9.000344827586206 }, { "ECDF": 0.76001, "mean beak depth (mm)": 9.002413793103447 }, { "ECDF": 0.77001, "mean beak depth (mm)": 9.004367816091953 }, { "ECDF": 0.78001, "mean beak depth (mm)": 9.00632183908046 }, { "ECDF": 0.79001, "mean beak depth (mm)": 
9.008390804597703 }, { "ECDF": 0.80001, "mean beak depth (mm)": 9.010574712643681 }, { "ECDF": 0.81001, "mean beak depth (mm)": 9.012873563218392 }, { "ECDF": 0.82001, "mean beak depth (mm)": 9.01528735632184 }, { "ECDF": 0.83001, "mean beak depth (mm)": 9.017701149425289 }, { "ECDF": 0.84001, "mean beak depth (mm)": 9.020344827586207 }, { "ECDF": 0.85001, "mean beak depth (mm)": 9.022988505747126 }, { "ECDF": 0.86001, "mean beak depth (mm)": 9.025632183908048 }, { "ECDF": 0.87001, "mean beak depth (mm)": 9.02862068965517 }, { "ECDF": 0.88001, "mean beak depth (mm)": 9.031494252873564 }, { "ECDF": 0.89001, "mean beak depth (mm)": 9.034827586206895 }, { "ECDF": 0.90001, "mean beak depth (mm)": 9.038275862068968 }, { "ECDF": 0.91001, "mean beak depth (mm)": 9.042068965517242 }, { "ECDF": 0.92001, "mean beak depth (mm)": 9.045977011494255 }, { "ECDF": 0.93001, "mean beak depth (mm)": 9.050459770114943 }, { "ECDF": 0.94001, "mean beak depth (mm)": 9.05528735632184 }, { "ECDF": 0.95001, "mean beak depth (mm)": 9.061149425287358 }, { "ECDF": 0.96001, "mean beak depth (mm)": 9.06793103448276 }, { "ECDF": 0.97001, "mean beak depth (mm)": 9.076091954022989 }, { "ECDF": 0.98001, "mean beak depth (mm)": 9.087011494252874 }, { "ECDF": 0.99001, "mean beak depth (mm)": 9.103908045977013 } ] }, "encoding": { "x": { "field": "mean beak depth (mm)", "scale": { "zero": false }, "type": "quantitative" }, "y": { "field": "ECDF", "type": "quantitative" } }, "mark": "point" }, "image/png": "", "text/plain": [ "\n", "\n", "If you see this message, it means the renderer has not been properly enabled\n", "for the frontend that you are using. 
For more information, see\n", "https://altair-viz.github.io/user_guide/troubleshooting.html\n" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Compute ECDF\n", "x, y = bootcamp_utils.ecdf_vals(bs_reps_1975)\n", "\n", "# Thinned set for plotting\n", "df_ecdf = pd.DataFrame(data={'mean beak depth (mm)': x[::1000],\n", "                             'ECDF': y[::1000]})\n", "\n", "# Make the plot\n", "alt.Chart(df_ecdf\n", "    ).mark_point(\n", "    ).encode(\n", "        x=alt.X('mean beak depth (mm):Q', scale=alt.Scale(zero=False)),\n", "        y='ECDF:Q')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It looks Normally distributed, and in fact the Central Limit Theorem says it should be approximately Normally distributed. We will not discuss the theorem here, and we did not need to derive anything; hacker statistics brought us here! The most probable mean (located at the inflection point on the CDF) is 8.96 mm, which is what was measured, but upon repeating the experiment, we could get a mean as low as about 8.7 mm or as high as about 9.2 mm.\n", "\n", "Let's compute the 95% confidence interval."
] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([8.84298851, 9.08103448])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "conf_int_1975 = np.percentile(bs_reps_1975, [2.5, 97.5])\n", "conf_int_1975" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Aside: list comprehensions\n", "\n", "The construction we had for making our bootstrap replicates was a bit clunky:\n", "\n", "    # Initialize bootstrap replicates array\n", "    bs_reps_1975 = np.empty(n_reps)\n", "\n", "    # Compute replicates\n", "    for i in range(n_reps):\n", "        bs_sample = np.random.choice(bd_1975, size=len(bd_1975))\n", "        bs_reps_1975[i] = np.mean(bs_sample)\n", "\n", "We had to set up an empty array, and then loop through each index, draw a bootstrap sample, compute its mean to get the replicate, and then place it in the array. We could, instead, write a function to compute a bootstrap replicate from data." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": true }, "outputs": [], "source": [ "def draw_bs_rep(data, func=np.mean):\n", "    \"\"\"Compute a bootstrap replicate from data.\"\"\"\n", "    bs_sample = np.random.choice(data, size=len(data))\n", "    return func(bs_sample)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that this function is generic in that it can compute the replicate using any function, such as `np.median()`, `np.std()`, or anything else. With this function in hand, our code starts to look a little cleaner.\n", "\n", "```python\n", "# Initialize bootstrap replicates array\n", "bs_reps_1975 = np.empty(n_reps)\n", "\n", "# Compute replicates, storing each one at its own index\n", "for i in range(n_reps):\n", "    bs_reps_1975[i] = draw_bs_rep(bd_1975, func=np.mean)\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We still have the problem that we initialize the array and then do the **`for`** loop. 
Python offers us a very useful alternative to this kind of procedure of initializing an array and then looping over it: **list comprehensions**. This, as usual, is best seen by example." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bs_reps_1975 = [draw_bs_rep(bd_1975, func=np.mean) for _ in range(n_reps)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use a list comprehension to make a list, you enclose the expression in brackets. The first item in the brackets is what you want to put at each element of the list you are creating. Next, you use the **`for`** keyword to set up the iteration that generates the elements of the list. In our case, the index of the element does not matter; we just want to compute `n_reps` replicates, so we do not need to keep track of the index and use the conventional throwaway name `_` for it.\n", "\n", "The result is a list, so we might want to convert it to a Numpy array." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bs_reps_1975 = np.array([draw_bs_rep(bd_1975, func=np.mean) for _ in range(n_reps)])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is much more concise and perhaps cleaner syntax.\n", "\n", "Just to show how values of an iterator can be passed for each element, here is how you can make a list of the first 10 cubes." ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[i**3 for i in range(1, 11)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Great! Now let's use a list comprehension to make bootstrap replicates for the 2012 samples."
] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Compute replicates\n", "bs_reps_2012 = np.array([draw_bs_rep(bd_2012, func=np.mean) for _ in range(n_reps)])\n", "\n", "# Compute the confidence interval\n", "conf_int_2012 = np.percentile(bs_reps_2012, [2.5, 97.5])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's print the two confidence intervals next to each other for comparison." ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[8.84298851 9.08103448]\n", "[9.07142857 9.30515873]\n" ] } ], "source": [ "print(conf_int_1975)\n", "print(conf_int_2012)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, the 95% confidence intervals for 1975 and 2012 only just overlap. This suggests that the inherent variation in beak depths is likely not responsible for the observed difference. There was likely some selective pressure toward deeper beaks." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Equivalence of bootstrap samples and standard error of the mean\n", "\n", "The **standard error of the mean**, or SEM, is a measure of uncertainty of the estimate of the mean. In other words, if we did the set of measurements again, we would get a different mean. The variability in these measured means is described by the SEM. Specifically, it is the standard deviation of the Normal distribution describing the mean of repeated measurements. So, from bootstrap replicates, we can estimate it directly as the standard deviation of the replicates."
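, "\n", "\n", "Written out, with notation that is assumed here rather than taken from the lesson ($B$ for the number of bootstrap replicates, $\\bar{x}^*_b$ for the $b$-th replicate of the mean, and $\\langle\\bar{x}^*\\rangle$ for the average of all replicates), the bootstrap estimate is roughly\n", "\n", "\\begin{align}\n", "\\mathrm{SEM}_\\mathrm{bs} = \\sqrt{\\frac{1}{B}\\sum_{b=1}^{B}\\left(\\bar{x}^*_b - \\langle\\bar{x}^*\\rangle\\right)^2},\n", "\\end{align}\n", "\n", "which is what `np.std()` computes with its default `ddof=0`."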
] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.060493114594221874" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bs_sem = np.std(bs_reps_1975)\n", "bs_sem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It can be shown analytically that the SEM can be computed directly from the measurements as the standard deviation of the measurements divided by the square root of the number of measurements." ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0.06074539219629801" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sem = np.std(bd_1975, ddof=1) / np.sqrt(len(bd_1975))\n", "sem" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Hey, we got nearly the same result (0.0605 mm versus 0.0607 mm)! The advantage of the bootstrap is that replicates are easy to generate for any statistic, even when no simple analytical expression for the error, like the one for the mean, is available." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bootstrap confidence interval of the standard deviation\n", "\n", "We are not limited to computing bootstrap confidence intervals of the mean. We could compute bootstrap confidence intervals of any statistic, like the median, the standard deviation, the standard deviation divided by the mean (coefficient of variation), or whatever we like. Computing the confidence interval for the standard deviation follows the same procedure as before; we just put `np.std` in for `np.mean`."
] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.47735063 0.63863112]\n" ] }, { "data": { "application/vnd.vegalite.v2+json": { "$schema": "https://vega.github.io/schema/vega-lite/v2.4.3.json", "config": { "view": { "height": 300, "width": 400 } }, "data": { "values": [ { "ECDF": 1e-05, "std beak depth (mm)": 0.3605288536309346 }, { "ECDF": 0.01001, "std beak depth (mm)": 0.4621550214369033 }, { "ECDF": 0.02001, "std beak depth (mm)": 0.4733910801973468 }, { "ECDF": 0.03001, "std beak depth (mm)": 0.480462519928335 }, { "ECDF": 0.04001, "std beak depth (mm)": 0.48569141990629244 }, { "ECDF": 0.05001, "std beak depth (mm)": 0.4901477474949997 }, { "ECDF": 0.06001, "std beak depth (mm)": 0.49398083933094655 }, { "ECDF": 0.07001, "std beak depth (mm)": 0.49715443413288507 }, { "ECDF": 0.08001, "std beak depth (mm)": 0.5000368595212805 }, { "ECDF": 0.09001, "std beak depth (mm)": 0.5027803788868247 }, { "ECDF": 0.10001, "std beak depth (mm)": 0.5052150021931349 }, { "ECDF": 0.11001, "std beak depth (mm)": 0.5074869499404778 }, { "ECDF": 0.12001, "std beak depth (mm)": 0.509595234941124 }, { "ECDF": 0.13001, "std beak depth (mm)": 0.5116045341529277 }, { "ECDF": 0.14001, "std beak depth (mm)": 0.5136139727335934 }, { "ECDF": 0.15001, "std beak depth (mm)": 0.5154906770536313 }, { "ECDF": 0.16001, "std beak depth (mm)": 0.5173212198916054 }, { "ECDF": 0.17001, "std beak depth (mm)": 0.5190178942777656 }, { "ECDF": 0.18001, "std beak depth (mm)": 0.520631444819068 }, { "ECDF": 0.19001, "std beak depth (mm)": 0.5223072485968844 }, { "ECDF": 0.20001, "std beak depth (mm)": 0.523795033026343 }, { "ECDF": 0.21001, "std beak depth (mm)": 0.5252577521273755 }, { "ECDF": 0.22001, "std beak depth (mm)": 0.5266076869597716 }, { "ECDF": 0.23001, "std beak depth (mm)": 0.527946012085114 }, { "ECDF": 0.24001, "std beak depth (mm)": 0.5293453755552668 }, { "ECDF": 0.25001, "std beak depth (mm)": 
0.530555613313879 }, { "ECDF": 0.26001, "std beak depth (mm)": 0.5318895191323307 }, { "ECDF": 0.27001, "std beak depth (mm)": 0.5332351772537147 }, { "ECDF": 0.28001, "std beak depth (mm)": 0.5344912865467235 }, { "ECDF": 0.29001, "std beak depth (mm)": 0.5357275826445592 }, { "ECDF": 0.30001, "std beak depth (mm)": 0.5369404623546462 }, { "ECDF": 0.31001, "std beak depth (mm)": 0.5381454283442596 }, { "ECDF": 0.32001, "std beak depth (mm)": 0.5392619354859725 }, { "ECDF": 0.33001, "std beak depth (mm)": 0.5404120749279843 }, { "ECDF": 0.34001, "std beak depth (mm)": 0.5415746286024524 }, { "ECDF": 0.35001, "std beak depth (mm)": 0.5427131967730697 }, { "ECDF": 0.36001, "std beak depth (mm)": 0.5437796070696931 }, { "ECDF": 0.37001, "std beak depth (mm)": 0.5448572910084512 }, { "ECDF": 0.38001, "std beak depth (mm)": 0.5460285759074254 }, { "ECDF": 0.39001, "std beak depth (mm)": 0.5470965653972502 }, { "ECDF": 0.40001, "std beak depth (mm)": 0.5482606325748169 }, { "ECDF": 0.41001, "std beak depth (mm)": 0.5493491266407394 }, { "ECDF": 0.42001, "std beak depth (mm)": 0.5503968229406228 }, { "ECDF": 0.43001, "std beak depth (mm)": 0.5514098242618723 }, { "ECDF": 0.44001, "std beak depth (mm)": 0.5524736288571968 }, { "ECDF": 0.45001, "std beak depth (mm)": 0.5535150053468655 }, { "ECDF": 0.46001, "std beak depth (mm)": 0.5545059420250147 }, { "ECDF": 0.47001, "std beak depth (mm)": 0.5555455541443516 }, { "ECDF": 0.48001, "std beak depth (mm)": 0.5565813017061432 }, { "ECDF": 0.49001, "std beak depth (mm)": 0.5576242235942295 }, { "ECDF": 0.50001, "std beak depth (mm)": 0.5586909278959149 }, { "ECDF": 0.51001, "std beak depth (mm)": 0.5597370237333584 }, { "ECDF": 0.52001, "std beak depth (mm)": 0.5607885658280511 }, { "ECDF": 0.53001, "std beak depth (mm)": 0.5618173989834259 }, { "ECDF": 0.54001, "std beak depth (mm)": 0.5629061062466184 }, { "ECDF": 0.55001, "std beak depth (mm)": 0.5639345200921049 }, { "ECDF": 0.56001, "std beak depth (mm)": 
0.5649202764528131 }, { "ECDF": 0.57001, "std beak depth (mm)": 0.5659970863971073 }, { "ECDF": 0.58001, "std beak depth (mm)": 0.5670874146171371 }, { "ECDF": 0.59001, "std beak depth (mm)": 0.5681513040966086 }, { "ECDF": 0.60001, "std beak depth (mm)": 0.569193406120578 }, { "ECDF": 0.61001, "std beak depth (mm)": 0.5702639543861101 }, { "ECDF": 0.62001, "std beak depth (mm)": 0.5713064578957596 }, { "ECDF": 0.63001, "std beak depth (mm)": 0.5724127083034544 }, { "ECDF": 0.64001, "std beak depth (mm)": 0.5734287037344867 }, { "ECDF": 0.65001, "std beak depth (mm)": 0.5745245209346929 }, { "ECDF": 0.66001, "std beak depth (mm)": 0.5756437973930385 }, { "ECDF": 0.67001, "std beak depth (mm)": 0.5768001856979289 }, { "ECDF": 0.68001, "std beak depth (mm)": 0.5779468994429934 }, { "ECDF": 0.69001, "std beak depth (mm)": 0.5791368787909669 }, { "ECDF": 0.70001, "std beak depth (mm)": 0.5803036093571168 }, { "ECDF": 0.71001, "std beak depth (mm)": 0.5814840172410396 }, { "ECDF": 0.72001, "std beak depth (mm)": 0.5827004210435843 }, { "ECDF": 0.73001, "std beak depth (mm)": 0.583959315334479 }, { "ECDF": 0.74001, "std beak depth (mm)": 0.5851944152444047 }, { "ECDF": 0.75001, "std beak depth (mm)": 0.5864090478579754 }, { "ECDF": 0.76001, "std beak depth (mm)": 0.5877270350906588 }, { "ECDF": 0.77001, "std beak depth (mm)": 0.5891137080608755 }, { "ECDF": 0.78001, "std beak depth (mm)": 0.5904813731870432 }, { "ECDF": 0.79001, "std beak depth (mm)": 0.5919190928205486 }, { "ECDF": 0.80001, "std beak depth (mm)": 0.5932864147971236 }, { "ECDF": 0.81001, "std beak depth (mm)": 0.5947777312662709 }, { "ECDF": 0.82001, "std beak depth (mm)": 0.5962266294663185 }, { "ECDF": 0.83001, "std beak depth (mm)": 0.5978464677456113 }, { "ECDF": 0.84001, "std beak depth (mm)": 0.5994580280012797 }, { "ECDF": 0.85001, "std beak depth (mm)": 0.601143271544952 }, { "ECDF": 0.86001, "std beak depth (mm)": 0.6029067082700259 }, { "ECDF": 0.87001, "std beak depth (mm)": 0.6047494674252877 
}, { "ECDF": 0.88001, "std beak depth (mm)": 0.6068864506299764 }, { "ECDF": 0.89001, "std beak depth (mm)": 0.6089716570246277 }, { "ECDF": 0.90001, "std beak depth (mm)": 0.6113091933264394 }, { "ECDF": 0.91001, "std beak depth (mm)": 0.6137152862979307 }, { "ECDF": 0.92001, "std beak depth (mm)": 0.6162105521463929 }, { "ECDF": 0.93001, "std beak depth (mm)": 0.6191895665962794 }, { "ECDF": 0.94001, "std beak depth (mm)": 0.6223479338119813 }, { "ECDF": 0.95001, "std beak depth (mm)": 0.6259946979574097 }, { "ECDF": 0.96001, "std beak depth (mm)": 0.6301739206757984 }, { "ECDF": 0.97001, "std beak depth (mm)": 0.6353569273685763 }, { "ECDF": 0.98001, "std beak depth (mm)": 0.6422672961419463 }, { "ECDF": 0.99001, "std beak depth (mm)": 0.653001565888005 } ] }, "encoding": { "x": { "field": "std beak depth (mm)", "scale": { "zero": false }, "type": "quantitative" }, "y": { "field": "ECDF", "type": "quantitative" } }, "mark": "point" }, "image/png": "", "text/plain": [ "\n", "\n", "If you see this message, it means the renderer has not been properly enabled\n", "for the frontend that you are using. 
For more information, see\n", "https://altair-viz.github.io/user_guide/troubleshooting.html\n" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Compute replicates\n", "bs_reps_1975 = np.array([draw_bs_rep(bd_1975, func=np.std) for _ in range(n_reps)])\n", "\n", "# Compute confidence interval\n", "conf_int_1975 = np.percentile(bs_reps_1975, [2.5, 97.5])\n", "print(conf_int_1975)\n", "\n", "# Compute ECDF\n", "x, y = bootcamp_utils.ecdf_vals(bs_reps_1975)\n", "\n", "# Thinned set for plotting\n", "df_ecdf = pd.DataFrame(data={'std beak depth (mm)': x[::1000],\n", "                             'ECDF': y[::1000]})\n", "\n", "# Make the plot\n", "alt.Chart(df_ecdf\n", "    ).mark_point(\n", "    ).encode(\n", "        x=alt.X('std beak depth (mm):Q', scale=alt.Scale(zero=False)),\n", "        y='ECDF:Q')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So, we now also have an estimate for the variability in beak depth. The 95% confidence interval for the standard deviation of the 1975 beak depths runs from about 0.48 to 0.64 mm." ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.6" } }, "nbformat": 4, "nbformat_minor": 2 }