Week 4

Errors, data manipulation with dplyr, ggplot: themes, colors, facets
Published

January 29, 2026

Modified

February 25, 2026

Topics

Part 4:

  • Learn and apply mutate() to change the data type of a variable
  • Apply mutate() to calculate a new variable based on other variables in a data.frame.
  • Apply case_when() within a mutate() statement to make a continuous variable categorical
  • Learn how to mutate() across() multiple columns at once.
  • Learn how to summarize() data with group_by() to summarize within categories
  • Learn how to summarize() data with multiple columns and functions at once, also with across().
  • Learn about the factor variable type and how factors differ from character vectors
  • Learn to change scales and palettes of ggplots.
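
As a preview of the dplyr topics above, here is a minimal sketch, not the class's own example. It assumes dplyr 1.1.0 or later (for the .default argument of case_when()) and R 4.1 or later (for the |> pipe). The patients data and all of its column names are hypothetical and made up for illustration only.

```r
library(dplyr)

# Hypothetical toy data; every name and value here is invented for illustration
patients <- tibble(
  id        = c("1", "2", "3", "4"),
  site      = c("A", "A", "B", "B"),
  height_cm = c(160, 175, 182, 168),
  weight_kg = c(55, 82, 95, 70)
)

patients2 <- patients |>
  mutate(
    id  = as.integer(id),                    # change a variable's data type
    bmi = weight_kg / (height_cm / 100)^2,   # new variable from other variables
    bmi_cat = case_when(                     # continuous -> categorical
      bmi < 18.5 ~ "underweight",
      bmi < 25   ~ "normal",
      bmi < 30   ~ "overweight",
      .default = "obese"
    ),
    site = factor(site),                     # character -> factor
    across(c(height_cm, weight_kg), round)   # same function on several columns
  )

# Summarize within categories, applying several functions to several columns
site_summary <- patients2 |>
  group_by(site) |>
  summarize(
    n = n(),
    across(c(height_cm, weight_kg), list(mean = mean, sd = sd)),
    .groups = "drop"
  )

# A factor stores a fixed set of levels plus integer codes;
# a character vector stores only the strings
levels(patients2$site)
as.integer(patients2$site)
```

With across() and a named list of functions, the summary columns are named by combining the column and function names, for example height_cm_mean and weight_kg_sd.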

Announcements

  • Functions of the Week
    • Sign up for your Functions of the Week presentation using the signup sheet in the functions_of_the_week folder in OneDrive.
    • Presentations will be during weeks 5-10, with no more than 4 presentations per week.
    • If there is a function you are interested in presenting that is not on the signup sheet, please check with me. If it hasn’t been covered before and isn’t covered in the class, then most likely I will approve it.
  • TA office hours (Michael Daily) are Mondays 3:00-4:00 pm via Webex. The Webex link is on the main course page in Sakai.
  • Cascadia R Conf is June 26-27 this year. It will be held at OHSU in RLSB. This is a great conference to meet other R enthusiasts in the area and learn more about what they are working on.

Class materials

  • Class materials are in the OneDrive folder BSTA_526_W26_class_materials_public.
  • For today’s class, make sure to download the folder called part4 to your computer.
  • Open RStudio by double-clicking on the project file called BSTA_526_W26_class_materials_public.Rproj in the main OneDrive folder.
  • Part 4: OneDrive folder, Slides, Webpage

Readings

R4DS = R for Data Science (2e)

Required

  • R4DS book sections:
    • mutate(): Section 3.3.1
    • rename(): Section 3.3.3
    • group_by() and summarize(): Section 3.5. The 1st edition’s Section 5.6 on Grouped summaries with summarise() is more detailed (wordier) and worth looking at as well.
    • More group_by() and counts with n(): Section 3.6
    • case_when() within mutate(): Section 12.5.2
    • across() with summarize() or mutate(): Section 26.2 (26.2.1 - 26.2.4)
    • factor(): Sections 16.1 & 16.2. Later in the quarter we will continue working with factors, which are covered in the other Chapter 16 sections.

Optional

  • Column-wise operations vignette - more summarize() with across() examples
  • Data Science: A First Introduction
    • summarize() and across(): Section 3.9 (skip part on using map() in section 3.9.4)
    • mutate() with across(): Section 3.10
  • ggplot2: Elegant Graphics for Data Analysis
    • Scales & guides: Chapter 14
  • Coding style guides

Post-class survey

  • Please fill out the post-class survey to provide feedback. Thank you!
  • Previous muddiest points and clearest points with responses are collected here.

Homework

  • See OneDrive folder for homework assignment.
  • HW 4 is due on 02/05.

Recording

  • In-class recording links are on Sakai. Navigate to Course Materials -> Schedule with links to in-class recordings to find them.