library(tidyverse)
library(tidymodels)
manhattan <- read_csv("data/manhattan.csv")
law <- read_csv("data/lsat_gpa.csv")
Use infer to obtain a bootstrap distribution.
On a given day in 2018, twenty one-bedroom apartments were randomly selected on Craigslist Manhattan from apartments listed as “by owner”. The data are in the manhattan data frame. We will use this sample to conduct inference on the typical rent of one-bedroom apartments in Manhattan.
Let’s start by using bootstrapping to estimate the mean rent of one-bedroom apartments in Manhattan.
What is the point estimate of the typical rent?
Recap: last time we did a manual bootstrap by drawing rent values from a box, sampling with replacement.
class_bootstrap <- c(2150, 1795, 3800, 3800, 3200, 3950, 3800, 3267, 2300, 2300, 2300, 3267, 2350, 1570, 2350, 2175, 1775, 4195, 2350, 4195)
# add code
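The box-sampling recap above can be sketched in base R. This is only an illustration under stated assumptions: the vector rents holds made-up values, not the manhattan data, and the seed is arbitrary.

```r
# Hypothetical rents (illustrative values, not the manhattan data)
rents <- c(2100, 2500, 3200, 1800, 2950)

set.seed(1)  # arbitrary seed so the resample is reproducible

# One bootstrap sample: draw as many values as the original sample,
# with replacement, so a value can appear more than once
boot_sample <- sample(rents, size = length(rents), replace = TRUE)

# The bootstrap statistic is the mean of the resampled values
boot_mean <- mean(boot_sample)
boot_mean
```

Repeating the last two steps many times and collecting the means gives a bootstrap distribution.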
We will use the infer package, included as part of tidymodels, to calculate a 95% confidence interval for the mean rent of one-bedroom apartments in Manhattan.
We start by setting a seed to make sure our analysis is reproducible. We’ll use 101221 to set our seed to today’s date, but you can use any value you want on assignments.
set.seed(101221)
We can use R to take many bootstrap samples and generate a bootstrap distribution.
Uncomment the lines and fill in the blanks to create the bootstrap distribution of sample means and save the results in the data frame boot_dist.
Use 500 reps for the in-class activity. (You will use about 15,000 reps for assignments outside of class.)
boot_dist <- manhattan #%>%
#specify(______) %>%
#generate(______) %>%
#calculate(______)
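For reference, here is a minimal sketch of the same specify / generate / calculate pattern on a toy data frame. The data frame toy, its column rent, and the name toy_boot are assumptions made for illustration; they are not the manhattan data or the expected answer.

```r
library(tidyverse)
library(tidymodels)  # attaches infer

# Hypothetical data frame (illustrative values, not the manhattan data)
toy <- tibble(rent = c(2100, 2500, 3200, 1800, 2950, 2400))

set.seed(1)  # arbitrary seed for reproducibility

toy_boot <- toy %>%
  specify(response = rent) %>%                  # variable of interest
  generate(reps = 500, type = "bootstrap") %>%  # 500 resamples with replacement
  calculate(stat = "mean")                      # one sample mean per resample

toy_boot
```

The result has one row per bootstrap resample, with a replicate column identifying the resample and a stat column holding its mean.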
What are the rows and columns of boot_dist? What do they mean?
Visualize the bootstrap distribution using a histogram. Describe the shape, center, and spread of this distribution.
# add code
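One possible way to draw such a histogram is sketched below with ggplot2. The data frame toy_boot is built here from made-up rents with base R resampling so the example stands alone; its stat column is assumed to play the same role as the bootstrap statistics in boot_dist. The bin count of 20 is an arbitrary choice.

```r
library(tidyverse)

# Hypothetical rents (illustrative values, not the manhattan data)
rents <- c(2100, 2500, 3200, 1800, 2950, 2400)

set.seed(1)  # arbitrary seed for reproducibility

# 500 bootstrap means, stored in a column named stat
toy_boot <- tibble(
  stat = replicate(500, mean(sample(rents, replace = TRUE)))
)

# Histogram of the bootstrap means
p <- ggplot(toy_boot, aes(x = stat)) +
  geom_histogram(bins = 20) +
  labs(x = "Bootstrap sample mean", y = "Count")
p
```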
Uncomment the lines and fill in the blanks to construct the 95% bootstrap confidence interval for the mean rent of one-bedroom apartments in Manhattan.
#___ %>%
# summarize(lower = quantile(______),
# upper = quantile(______))
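As a hedged illustration of the percentile method, the sketch below takes the middle 95% of a toy bootstrap distribution. The names rents, toy_boot, and toy_ci are assumptions, and the values are made up rather than taken from the manhattan data.

```r
library(tidyverse)

# Hypothetical rents (illustrative values, not the manhattan data)
rents <- c(2100, 2500, 3200, 1800, 2950, 2400)

set.seed(1)  # arbitrary seed for reproducibility

# 500 bootstrap means, stored in a column named stat
toy_boot <- tibble(
  stat = replicate(500, mean(sample(rents, replace = TRUE)))
)

# Middle 95%: cut off 2.5% of the bootstrap means in each tail
toy_ci <- toy_boot %>%
  summarize(lower = quantile(stat, 0.025),
            upper = quantile(stat, 0.975))
toy_ci
```

The same pattern covers other confidence levels by changing the tail probabilities: 0.05 and 0.95 for a 90% interval, 0.005 and 0.995 for a 99% interval.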
Write the interpretation for the interval calculated above.
# calculate a 90% confidence interval
# calculate a 99% confidence interval
Next, use bootstrapping to estimate the median rent for one-bedroom apartments in Manhattan.
Save the results in boot_dist_median.
## add code
## add code
The law data frame contains data about LSAT (law school exam) scores and GPA (grade point average).
What’s the correlation between LSAT score and GPA?
# law %>%
Report a 95% bootstrap confidence interval for the sample correlation.
# add code
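A bootstrap interval for a correlation follows the same pattern as for a mean. The sketch below uses a toy data frame; the column names lsat and gpa and all values are assumptions, since the actual column names in law are not shown here.

```r
library(tidyverse)
library(tidymodels)  # attaches infer

# Hypothetical scores; the column names lsat and gpa are assumptions
toy_law <- tibble(
  lsat = c(560, 600, 580, 640, 620, 660, 590, 610),
  gpa  = c(2.9, 3.2, 3.0, 3.5, 3.1, 3.6, 3.3, 3.2)
)

# Point estimate: the sample correlation
toy_r <- toy_law %>%
  summarize(r = cor(lsat, gpa))

set.seed(1)  # arbitrary seed for reproducibility

# Resample rows (pairs of scores) and compute one correlation per resample
toy_boot_cor <- toy_law %>%
  specify(response = gpa, explanatory = lsat) %>%
  generate(reps = 500, type = "bootstrap") %>%
  calculate(stat = "correlation")

# 95% percentile interval from the bootstrap correlations
toy_cor_ci <- toy_boot_cor %>%
  summarize(lower = quantile(stat, 0.025),
            upper = quantile(stat, 0.975))
toy_cor_ci
```

Each resample draws whole rows, so every LSAT score stays paired with its GPA.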