library(tidyverse)
library(tidymodels)
manhattan <- read_csv("data/manhattan.csv")
On a given day in 2018, twenty one-bedroom apartments were randomly selected on Craigslist Manhattan from apartments listed as “by owner”. The data are in the manhattan data frame. We will use this sample to conduct inference on the typical rent of one-bedroom apartments in Manhattan.
Suppose you are interested in whether the mean rent of one-bedroom apartments in Manhattan is actually less than $3000. Choose the correct null and alternative hypotheses.
Let’s use simulation-based methods to conduct the hypothesis test specified in Exercise 1. We’ll start by generating the null distribution.
Fill in the code and uncomment the lines below to generate and then visualize the null distribution.
set.seed(101321)
#null_dist <- manhattan %>%
#  specify(response = ______) %>%
#  hypothesize(null = ______, mu = ______) %>%
#  generate(reps = 100, type = "bootstrap") %>%
#  calculate(stat = _____)
#visualize(null_dist)
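To build intuition for what a bootstrap null distribution for a mean looks like, here is a minimal base R sketch. It assumes the usual approach of shifting the sample so its mean equals the hypothesized value and then resampling with replacement. The vector x and the value mu0 are made-up illustration values, not the manhattan data or the answer to the exercise.

```r
# Toy sample (hypothetical values, not the manhattan data)
set.seed(1)
x <- c(2100, 2500, 2750, 3100, 3400, 2950, 2600, 3800)
mu0 <- 3000  # hypothesized mean, chosen only for illustration

# Shift the sample so that its mean equals the null value
x_shifted <- x - mean(x) + mu0

# Resample with replacement 1000 times and record each resample mean
sim_means <- replicate(1000, mean(sample(x_shifted, replace = TRUE)))

# The simulated means should be centered close to mu0
mean(sim_means)
```

Each simulated mean plays the same role as one row of a null distribution: it is a sample mean we could plausibly see if the null hypothesis were true.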
Fill in the code and uncomment the lines below to calculate the p-value using the null distribution from Exercise 2.
mean_rent <- manhattan %>%
  summarise(mean_rent = mean(rent)) %>%
  pull()
#null_dist %>%
# get_p_value(obs_stat = ___ , direction = "____")
Fill in the direction in the code below and uncomment to visualize the shaded area used to calculate the p-value.
#visualize(null_dist) +
# shade_p_value(obs_stat = mean_rent, direction = "______")
Let’s think about what’s happening when we run get_p_value. Fill in the code below to calculate the p-value “manually” using some of the dplyr functions we’ve learned.
#null_dist %>%
# filter(_____) %>%
# summarise(p_value = ______)
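As a rough illustration of the idea, a simulation-based p-value is the proportion of simulated statistics that are at least as extreme as the observed one, in the direction of the alternative. The sketch below uses base R with made-up numbers; sim_stats and obs_stat are hypothetical stand-ins, not values from the manhattan analysis.

```r
set.seed(2)
sim_stats <- rnorm(1000, mean = 10, sd = 2)  # hypothetical simulated statistics
obs_stat <- 7                                # hypothetical observed statistic

# Proportion of simulated statistics at or below the observed statistic
p_less <- mean(sim_stats <= obs_stat)

# Proportion of simulated statistics at or above the observed statistic
p_greater <- mean(sim_stats >= obs_stat)

c(less = p_less, greater = p_greater)
```

Which of the two proportions is the p-value depends on the direction of the alternative hypothesis.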
Use the p-value to make your conclusion using a significance level of 0.05. Remember, the conclusion has three components.
Suppose instead you wanted to test the claim that the mean rent is not equal to $3000. Which of the following would change? Select all that apply.
Let’s test the claim in Exercise 5. Conduct the hypothesis test, then state your conclusion in the context of the data.
# add code
Create a new variable over2500 that indicates whether or not the rent is greater than $2500.
# add code
Suppose you are interested in testing whether a majority of one-bedroom apartments in Manhattan have rent greater than $2500.
State the null and alternative hypotheses.
Fill in the code to generate the null distribution.
#null_dist <- ____ %>%
# specify(response = ____, success = "_____") %>%
# hypothesize(null = "point", p = ____) %>%
# generate(reps = 100, type = "draw") %>%
# calculate(stat = "prop")
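For intuition about a null distribution for a proportion, the sketch below simulates sample proportions directly from a hypothesized population proportion in base R. This is only an approximation of the idea behind drawing from the null, under the assumption that each simulated sample has the same size as ours (twenty apartments). The value p0 is a made-up illustration value, not the null value for this exercise.

```r
set.seed(3)
n <- 20     # sample size, matching the twenty apartments in the sample
p0 <- 0.3   # hypothetical null proportion, for illustration only

# Each draw counts the successes in n trials, then converts to a proportion
sim_props <- rbinom(1000, size = n, prob = p0) / n

# With n = 20, every simulated proportion is a multiple of 1/20
head(sort(unique(sim_props)))
```

Because the sample is small, the null distribution takes only a limited set of values, which is worth keeping in mind when you visualize it.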
# add code
# add code