library(tidyverse)
library(knitr)
library(infer)
Due tomorrow:
Upcoming:
hw03 released today, due next Thursday
Misc:
Your midterm grade is computed using the assignment grades reported on Sakai, but according to the weight scale described on the syllabus.
For purposes of the midterm grade, no lowest assignment scores were dropped.
mean ae score (as %) * 2.5
+ mean quiz score (as %) * 5
+ mean lab score (as %) * 15
+ exam 01 score (as %) * 17.5
+ hw score (as %) * 25
all divided by 65.
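The weighting above can be checked with a few lines of R. This is only a sketch: the scores below are hypothetical placeholders, not anyone's actual grades, and the names `scores` and `weights` are made up for illustration.

```r
# Hypothetical scores (assumed for illustration), each as a percentage from 0 to 100
scores <- c(ae = 90, quiz = 80, lab = 95, exam01 = 85, hw = 88)

# Weights taken from the list above; they sum to 65
weights <- c(ae = 2.5, quiz = 5, lab = 15, exam01 = 17.5, hw = 25)

# Weighted sum of the components graded so far, divided by the total weight
midterm_grade <- sum(scores * weights) / sum(weights)
midterm_grade
```

Dividing by 65 rather than 100 rescales the result to a percentage, since only 65 points of the full weight scale have been graded so far.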
What’s the difference between a parameter and a statistic?
Is a parameter “random” or “fixed”? Is it typically “unknown” or “known”? Why?
On a given day in 2018, twenty one-bedroom apartments were randomly selected on Craigslist Manhattan from apartments listed as “by owner”. The data are in the manhattan data frame. We will use this sample to conduct inference on the typical rent of one-bedroom apartments in Manhattan.
manhattan <- read_csv("data/manhattan.csv")
Visualize the distribution of rent. Is the mean or the median a better measure of typical rent of one-bedroom apartments in Manhattan?
What is a point estimate of the typical rent?
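As a rough illustration of how the two candidate point estimates can differ, the sketch below computes both on a small made-up sample. The `rent` values are hypothetical and are not the class data; they were chosen to have a long right tail, which may or may not match what you see in the manhattan data frame.

```r
# Hypothetical rents in dollars (NOT the class data): 20 values with a long right tail
rent <- c(1850, 1995, 2100, 2150, 2200, 2250, 2300, 2350, 2400, 2450,
          2500, 2550, 2600, 2700, 2800, 2950, 3100, 3500, 4200, 6500)

# Two candidate point estimates of "typical" rent
mean_rent   <- mean(rent)
median_rent <- median(rent)

c(mean = mean_rent, median = median_rent)
```

In this made-up sample a few expensive listings pull the mean above the median, which is the kind of pattern to look for when choosing between the two.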
Let’s bootstrap!
Fill in the values from the bootstrap sample conducted in class. Once the values are filled in, uncomment the code.
# class_bootstrap <- c()
# add code
Does this statistic align with your expectations?
Here we’ve taken one bootstrap sample, but in practice we will need about 10,000 to 15,000! In the next lecture we will discuss how we can generate bootstrap samples using the infer package in R.
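Before turning to infer, it may help to see the idea written out by hand in base R. The sketch below assumes a hypothetical `rent` vector (not the class data) and an arbitrary seed; it repeats the in-class procedure many times: resample the original values with replacement, keep the sample size the same, and record the statistic each time.

```r
set.seed(2018)  # arbitrary seed, used only to make the sketch reproducible

# Hypothetical rents in dollars (NOT the class data)
rent <- c(1850, 1995, 2100, 2150, 2200, 2250, 2300, 2350, 2400, 2450,
          2500, 2550, 2600, 2700, 2800, 2950, 3100, 3500, 4200, 6500)

n_reps <- 10000  # number of bootstrap samples

# Each replicate draws length(rent) values with replacement and records the mean
boot_means <- replicate(
  n_reps,
  mean(sample(rent, size = length(rent), replace = TRUE))
)

# The spread of the bootstrap statistics estimates the variability of the sample mean
sd(boot_means)
```

Swapping `mean` for `median` inside `replicate()` would give a bootstrap distribution of the median instead.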
Sneak peek!
boot_dist <- manhattan %>%
  # specify the variable of interest
  specify(response = rent) %>%
  # generate 15000 bootstrap samples
  generate(reps = 15000, type = "bootstrap") %>%
  # calculate the statistic of each bootstrap sample
  calculate(stat = "mean")
boot_dist %>%
  ggplot(aes(x = stat)) +
  geom_histogram() +
  labs(x = "mean rent")