library(tidyverse)
Load the data
class <- read_csv("data/classData.csv")
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   lab_section = col_double(),
##   local_R = col_logical(),
##   work_with = col_double(),
##   bender = col_character(),
##   predicted_score = col_double()
## )
Take a look at the data
glimpse(class)
## Rows: 88
## Columns: 5
## $ lab_section <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
## $ local_R <lgl> TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FA…
## $ work_with <dbl> 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, NA, 1, 1, 1,…
## $ bender <chr> "Airbender", "Firebender", "Firebender", "Waterbender"…
## $ predicted_score <dbl> 89.43310, 74.27888, 71.08258, 54.80504, 59.83693, 82.3…
Which nation is the most represented? Least represented? Make a visualization to illustrate.
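One possible way to approach this is to count the rows for each value of bender and draw a bar chart ordered by count. The sketch below is only an illustration: toy_class is a small made-up stand-in so the code runs on its own, and its values are not the real class data. In the lab itself, swap in the class tibble loaded above.

```r
library(tidyverse)

# Hypothetical toy data (made-up values, NOT the real 88 rows).
# Replace toy_class with the `class` tibble when working on the lab.
toy_class <- tibble(
  bender = c("Airbender", "Firebender", "Firebender", "Waterbender",
             "Earthbender", "Firebender", "Waterbender")
)

# Count students per bender type, most common first
bender_counts <- toy_class %>%
  count(bender, sort = TRUE)

# Bar chart with bars ordered from most to least represented
bender_plot <- ggplot(bender_counts,
                      aes(x = fct_reorder(bender, n, .desc = TRUE), y = n)) +
  geom_col() +
  labs(x = "Bender type", y = "Number of students")

bender_plot
```

The tallest and shortest bars correspond to the most and least represented groups; the first and last rows of bender_counts give the same answer as a table.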
Are any lab sections predicted to do better than others on an airbender test? Is there a correlation between lab section and predicted score?
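A minimal sketch of one way to explore this: summarise predicted_score by lab_section, plot the distribution for each section, and compute a correlation coefficient. As before, toy_class is made-up illustrative data, not the real class data, so the numbers it produces say nothing about the actual sections. Note that lab_section is stored as a number but acts as a label, so the sketch converts it to a factor for plotting, and any correlation computed on the raw numbers should be read with that in mind.

```r
library(tidyverse)

# Hypothetical toy data (made-up values, NOT the real class data).
# Replace toy_class with the `class` tibble when working on the lab.
toy_class <- tibble(
  lab_section     = c(1, 1, 1, 2, 2, 2),
  predicted_score = c(89, 74, 71, 55, 60, 82)
)

# Mean predicted score and group size for each lab section
section_summary <- toy_class %>%
  group_by(lab_section) %>%
  summarise(mean_score = mean(predicted_score, na.rm = TRUE),
            n = n())

# Boxplot per section; factor() treats the section number as a category
section_plot <- ggplot(toy_class,
                       aes(x = factor(lab_section), y = predicted_score)) +
  geom_boxplot() +
  labs(x = "Lab section", y = "Predicted score")

# Pearson correlation on the raw section numbers, skipping missing values
section_cor <- cor(toy_class$lab_section, toy_class$predicted_score,
                   use = "complete.obs")

section_summary
section_plot
section_cor
```

Boxes that sit at clearly different heights would suggest a difference between sections; heavily overlapping boxes would suggest little or none.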
Why?
How could you see this on the plot above?