Lab #04: Team Workflow & Spatial Data

due Tue September 21 at 11:59 PM

Goals

Getting started

Check the announcement on Sakai or slack to determine your team number and members.

Before getting started on the lab, take a few minutes to develop a plan for working as a team for the rest of the semester.

Put your plan into writing here. Include:

Every team member should now go to the course GitHub organization and locate your lab04 repository, which should have the prefix lab04. Copy the URL of the repository and clone in RStudio. If you have trouble, see the first lab and lecture for step-by-step instructions or ask a teammate for help.

Do not edit the .Rmd file until explicitly asked to do so in the instructions.

North Carolina Congressional Districts and Redistricting.

In this lab you will use the sf package to visualize district-level congressional district data in the most recent congressional and presidential elections in North Carolina. Also, given the upcoming redistricting following the 2020 census, we will consider which districts may be overpopulated and thus may need to shrink.

The variables in the nc_dists dataset are as follows:

The variables in the nc_newdata file are as follows:

The presidential election data is from DailyKos Elections and the House election data is from the The CQ Voting and Elections Collection, accessed through Duke Libraries. The population data is from the 2020 census and has been compiled by Daily Kos Elections.

  1. Before getting started, please write a group contract on how you will work together as a team this semester…

Team workflow

Assign each team member a number 1 through 3 (or 4) and write your number down on a piece of paper. This lab will walk you through the basics of team workflow step-by-step.

Do the following exercises in order, following each step carefully.

Only one person at a time should type in the .Rmd file and push updates.

If you are working on any portion of the lab virtually, the person working should share their screen and the others should follow along.

Team member 1: Open the lab04.Rmd file and change the author of the YAML header to the following “Team Number: Member 1, Member 2, Member 3” with your team number (for example 05-1) and the first and last names of all team members.

Team member 1: Run the load-data code chunk to read in the data and print the first six rows. Share the results with your team members. Then, answer the questions below.

  1. Join the nc and nc_newdata datasets to form a dataset called nc_data. (Hint: the as.character command may be useful here.)

Team member 1: When you have finished, knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub with an appropriate commit message.

All other team members: Once your team member has pushed the work, pull to get the updated documents from GitHub. Click on the .Rmd file and you should see the responses to the second exercise. Run the code chunks for #2 to update the file in your environment.

Team member 2: It’s your turn. Answer the questions below.

  1. Modify the nc-plot-1 code chunk to create a plot of the 2020 presidential election in North Carolina by congressional District. Use informative colors with the scale_fill_gradient2 function. The colors “#DE0100” and “#0015BC” look good for Republicans and Democrats, respectively, but the choice is yours. Please set a midpoint at 50% of the vote. Give the plot an informative title, subtitle, and legend title. When you are finished, remove eval = FALSE.

Team member 2: Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub with an appropriate commit message.

All other team members: Once your team member has pushed the work, pull to get the updated documents from GitHub. Click on the .Rmd file and you should see the responses to the first three exercises. Run the code chunks for #2 and #3 to update the file in your environment.

Team member 3: It’s your turn. Complete the exercises below. (If you have four team members, you could have person 3 do #4 and person 4 do #5. The main point is to make sure everyone is contributing!)

  1. Create a new variable using mutate() in the gop-change code chunk. Overwrite the nc_data dataset to add a variable that measures the change in Republican vote share from 2020 in 2016 called trump_change. This variable should represent how much better (or worse) Trump did in a district in 2020 comprared to 2016. When you are finished, remove eval = FALSE.

  2. Modify the nc-plot-2 code chunk to create a plot of the Trump vote share change from 2016 to 2020, with districts colored according to trump_change. When making the scale, use informative colors, and set a midpoint at 0 (thus representing no change from 2016 to 2020.) When you are finished, remove eval = FALSE. Please describe what you observe. Did Trump improve anywhere in the state in 2020 compared to 2016?

Team member 3: Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub with an appropriate commit message.

All other team members: Once your team member has pushed the work, pull to get the updated documents from GitHub. Click on the .Rmd file and you should see the responses to the first five exercises. Run the code chunks for #1 - #4 to update the file in your environment.

Team member 1: It’s your turn. Complete the exercise below.

  1. Now, we are going to compare the Republican presidential performance in 2020 to the Republican congressional performance at the district level. The variable house_gop_pct_2020 represents the two-party Republican percentage of the U.S. House vote in 2020. First, I would like you to create a new variable called gop_house_overperformance that measures the difference between the percentage of the vote received by the U.S. House candidate and the percentage received by the Republican presidential candidate in 2020. Modify the nc-plot-3 code chunk to create a plot of the House Republican overperformance in 2020, with districts colored according to gop_house_overperformance. When making the scale, use informative colors, and set a midpoint at 0 (thus representing no difference between the House Republican candidate and Trump’s vote percentage). When you are finished, remove eval = FALSE.

Is this visualization informative? Why or why not? (You may need to do some research on the 2020 NC US House elections to answer this question- any major news website that has NC US House election results may be useful here).

Team member 1: Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub with an appropriate commit message.

All other team members: Once your team member has pushed the work, pull to get the updated documents from GitHub. Click on the .Rmd file and you should see the responses to the first five exercises. Run the code chunks for #1 - #4 to update the file in your environment.

Team member 2: Almost done! Just one more question…

  1. North Carolina is going to have to draw new congressional districts for 2022, following the 2020 Census and they will have to all have an equal population. North Carolina will now have a 14th Congressional District for the first time in its history! You will notice if you look at the POPULATION variable that, in 2010, there was a deviation of no more than 1 person in each district. Once again North Carolina will have to draw districts that have equal population. The variable population_2020 contains a variable measuring the population of the current (2010 census) districts.

In the nc_plot-4 chunk, please make a map of each district’s population as of 2020, with districts colored based upon the district’s estimated population in 2020. When making the scale use informative colors; you do not have to set a midpoint. When you are finished, remove eval = FALSE. What do you observe? Which regions of the state have the most overpopulated districts?

Team member 2: Check to confirm all code chunks are named, code does not exceed the 80 character limit and all code follows the tidyverse style guidelines. Make changes as necessary.

Team member 2: When you have finished, knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub with an appropriate commit message.

All other team members: Once your team member has pushed the work, pull to get the updated documents from GitHub. Click on the .Rmd file to see your final version of the lab.

Team member 3: Upload your team’s PDF to Gradescope. Include every team member’s name in the Gradescope submission and identify which problems are on each in Gradescope. Associate the “Overall” section with the first page of your PDF.

There should only be one submission per team on Gradescope.

Rubric: