library(tidyverse)
library(tidymodels)
manhattan <- read_csv("data/manhattan.csv")

Bulletin

Learning goals

Rent in Manhattan

On a given day in 2018, twenty one-bedroom apartments were randomly selected on Craigslist Manhattan from apartments listed as “by owner”. The data are in the manhattan data frame. We will use this sample to conduct inference on the typical rent of one-bedroom apartments in Manhattan.

Exercise 1

Suppose you are interested in whether the mean rent of one-bedroom apartments in Manhattan is actually less than $3000. Choose the correct null and alternative hypotheses.

  1. \(H_0: \mu = 3000 \text{ vs. }H_a: \mu \neq 3000\)
  2. \(H_0: \mu = 3000 \text{ vs. }H_a: \mu < 3000\)
  3. \(H_0: \mu = 3000 \text{ vs. }H_a: \mu > 3000\)
  4. \(H_0: \bar{x} = 3000 \text{ vs. }H_a: \bar{x} \neq 3000\)
  5. \(H_0: \bar{x} = 3000 \text{ vs. }H_a: \bar{x} < 3000\)
  6. \(H_0: \bar{x} = 3000 \text{ vs. }H_a: \bar{x} > 3000\)

Exercise 2

Let’s use simulation-based methods to conduct the hypothesis test specified in Exercise 1. We’ll start by generating the null distribution.

Fill in the code and uncomment the lines below to generate and then visualize the null distribution.

set.seed(101321)
#null_dist <- manhattan %>%
  #specify(response = ______) %>%
  #hypothesize(null = ______, mu = ______) %>%
  #generate(reps = 100, type = "bootstrap") %>%
  #calculate(stat = _____)
#visualize(null_dist)
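To build intuition for what a bootstrap null distribution for a mean looks like, here is a minimal base R sketch of the idea: shift the sample so its mean equals the null value, then resample with replacement and record each resample's mean. The rents and the null value `mu_0` below are made up for illustration (they are not the manhattan data or the value from Exercise 1), and this only approximates conceptually what the infer pipeline does.

```r
# Hypothetical rents (made-up values, NOT the manhattan data)
rents <- c(2100, 2450, 2600, 2750, 2900, 3100, 3300, 3600)
mu_0 <- 2500  # hypothetical null value, chosen only for this sketch

set.seed(1)

# Shift the sample so that its mean equals the null value
shifted <- rents - mean(rents) + mu_0

# Resample the shifted data with replacement and record each resample's mean
boot_means <- replicate(
  1000,
  mean(sample(shifted, size = length(shifted), replace = TRUE))
)

# The simulated means are centered near the null value
mean(boot_means)
```

Because the data are shifted before resampling, the simulated means show the sampling variability we would expect if the null hypothesis were true.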

Exercise 3

Fill in the code and uncomment the lines below to calculate the p-value using the null distribution from Exercise 2.

mean_rent <- manhattan %>% 
  summarise(mean_rent = mean(rent)) %>%
  pull()
#null_dist %>%
 # get_p_value(obs_stat = ___ , direction = "____")

Fill in the direction in the code below and uncomment to visualize the shaded area used to calculate the p-value.

#visualize(null_dist) +
 # shade_p_value(obs_stat = mean_rent, direction = "______")

Let’s think about what’s happening when we run get_p_value. Fill in the code below to calculate the p-value “manually” using some of the dplyr functions we’ve learned.

#null_dist %>%
#  filter(_____) %>%
#  summarise(p_value = ______)
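The idea behind a simulation-based p-value is that it is a proportion: the share of simulated statistics at least as extreme as the observed one, in the direction of the alternative. A small base R sketch with made-up numbers (the vector `sim_stats` and the value `obs_stat` are hypothetical, and the direction here is "greater", which may differ from the one you need):

```r
# Hypothetical simulated statistics (made-up values, not from null_dist)
sim_stats <- c(9.2, 10.1, 10.4, 9.8, 11.3, 10.0, 9.5, 10.9, 10.6, 9.9)
obs_stat <- 10.5  # hypothetical observed statistic

# One-sided ("greater") p-value: the proportion of simulated statistics
# that are at least as large as the observed statistic.
# Taking the mean of a logical vector gives the proportion of TRUEs.
p_greater <- mean(sim_stats >= obs_stat)
p_greater
```

The same logic carries over to dplyr: keep the rows that are at least as extreme as the observed statistic, then divide their count by the total number of simulated statistics.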

Exercise 4

Use the p-value to draw your conclusion at a significance level of 0.05. Remember, the conclusion has three components:

  • How the p-value compares to the significance level
  • The decision you make with respect to the hypotheses (reject \(H_0\) or fail to reject \(H_0\)).
  • The conclusion in the context of the analysis question.

Exercise 5

Suppose instead you wanted to test the claim that the mean rent is not equal to $3000. Which of the following would change? Select all that apply.

  1. Null hypothesis
  2. Alternative hypothesis
  3. Null distribution
  4. p-value

Exercise 6

Let’s test the claim in Exercise 5. Conduct the hypothesis test, then state your conclusion in the context of the data.

# add code

Exercise 7

Create a new variable over2500 that indicates whether or not the rent is greater than $2500.

# add code

Suppose you are interested in testing whether a majority of one-bedroom apartments in Manhattan have rent greater than $2500.

  • State the null and alternative hypotheses.

  • Fill in the code to generate the null distribution.

#null_dist <- ____ %>%
#  specify(response = ____, success = "_____") %>%
#  hypothesize(null = "point", p = ____) %>%
#  generate(reps = 100, type = "draw") %>%
#  calculate(stat = "prop")

  • Visualize the null distribution and shade in the area used to calculate the p-value.

# add code

  • Calculate the p-value. Then use the p-value to make your conclusion using a significance level of 0.05.

# add code
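For a test of a single proportion, the null distribution can be simulated by repeatedly drawing samples of successes and failures with the success probability set to the null value. The base R sketch below illustrates that idea under assumed values: the sample size `n`, null proportion `p_0`, and observed proportion `obs_prop` are all hypothetical and are not taken from the manhattan data. It is meant as a conceptual picture of a "draw"-type simulation, not as the infer implementation.

```r
set.seed(2)

n <- 30     # hypothetical sample size
p_0 <- 0.4  # hypothetical null proportion

# Each replicate draws n successes/failures with P(success) = p_0
# and records the proportion of successes in that simulated sample
null_props <- replicate(1000, mean(rbinom(n, size = 1, prob = p_0)))

obs_prop <- 0.6  # hypothetical observed sample proportion

# One-sided ("greater") p-value: the share of simulated proportions
# at least as large as the observed proportion
p_value <- mean(null_props >= obs_prop)
p_value
```

Unlike the bootstrap approach used for the mean, no resampling of the observed data is involved here: the null value alone determines how the simulated samples are generated.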