Week 5

Data cleaning, reshaping, and wrangling with multiple tables
Published: February 6, 2025

Modified: February 12, 2025

Topics

  • Practice using here() to load data in a subfolder of the project
  • Learn and apply bind_rows() to combine rows from two or more datasets
  • Practice working with and cleaning real data using forcats, stringr, and separate()
  • Learn about the different kinds of joins and how they merge data
    • Apply inner_join() and left_join() to join tables on columns
  • Moved to Week 6: Use pivot_longer() to reshape a wide dataset into long format
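
As a quick preview of the row-binding and join topics listed above, here is a minimal sketch. The tables and column names (visits_2023, visits_2024, patients, id, score, site) are hypothetical and invented for illustration; they are not taken from the class datasets.

```r
library(dplyr)

# Hypothetical toy tables, made up for this example only
visits_2023 <- tibble(id = c(1, 2), score = c(10, 12))
visits_2024 <- tibble(id = c(3, 4), score = c(9, 15))

# bind_rows() stacks the tables, matching columns by name
visits <- bind_rows(visits_2023, visits_2024)

# Hypothetical lookup table; note id 4 is missing and id 5 has no visit
patients <- tibble(id = c(1, 2, 3, 5), site = c("A", "B", "A", "C"))

# inner_join() keeps only the ids that appear in both tables
inner <- inner_join(visits, patients, by = "id")

# left_join() keeps every row of visits; unmatched ids get NA for site
left <- left_join(visits, patients, by = "id")
```

In this sketch the inner join drops id 4 because it has no match in patients, while the left join keeps it with a missing site. Which join is appropriate depends on whether unmatched rows should stay in the analysis.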

Announcements

  • If you haven’t already, please sign up for your function of the week presentation.
    • Please limit presentations to 3 per week.
    • Link to sign-up sheet is posted on Sakai.
  • The Midterm is now posted in Dropbox. It is due Sunday 2/23/25.
    • Please start early on this since finding a suitable dataset might take some time.
    • We encourage you to meet with either of us to discuss your research question and data to make sure you are on the right track.
  • Class materials for BSTA 526 will be provided in the shared Dropbox folder BSTA_526_W25_class_materials_public.
  • For today’s class, download the folder part_05 to your computer, then open RStudio by double-clicking the file part_05.Rproj.

Class materials

  • Readings
    • Note: I updated the Week 4 Readings to include the topics that we moved from part 5 to part 4 this year.
  • Dropbox part_05 Project folder
    • We got through Section 12 of the html file (joining data) and will cover Section 13 (Reshaping data) during Week 6.

Post-class survey

Homework

  • See the Dropbox part_05 folder for the homework assignment.
    • HW 5 was updated after class to remove questions 4 & 5, which will be on HW 6 instead. Make sure to use the file hw_05_b526_v2.qmd on Dropbox.
  • HW 5 is due on 2/13.

Recording

  • In-class recording links are on Sakai. Navigate to Course Materials -> Schedule with links to in-class recordings. Note that the password to the recordings is at the top of the page.