Week 2
Projects, reading in data, data frames, summarizing data, visualizing data with ggplot2
Topics
Part 2:
- Projects
- Reading in data
- Tidy data
- Data frames
- Getting to know a dataset
- Visualizing data with ggplot2 (intro)
Announcements
- I updated the Grading Scale in the syllabus
- Class materials for BSTA 526 will be provided in the shared OneDrive folder BSTA_526_W26_class_materials_public.
- For today’s class, make sure to download the folder called part2 to your computer.
- Open RStudio by double-clicking on the project file called BSTA_526_W26_class_materials_public.Rproj in the main OneDrive folder.
Class materials
| Part | OneDrive folder | Slides | Webpage |
|---|---|---|---|
| 2 | | | |
Readings
Required
- Data Organization in Spreadsheets by Kara Woo and Karl Broman
- This paper is a must-read for anyone who works with data.
- Note in particular Section 4: Write Dates as YYYY-MM-DD
- Absolute and Relative File Paths (Section 2.3) from Data Science: A First Introduction by Timbers, Campbell, & Lee
- Understanding file paths can be difficult if you are not used to working with files on your computer. For programming, it is important to understand file paths, and in particular the difference between absolute and relative paths. This section is a great follow-up reading.
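To make the absolute versus relative distinction from the file paths reading concrete, here is a minimal base R sketch. The folder name part2 matches the folder mentioned in the announcements, but the file name example.csv and the demo_project folder are hypothetical, and the sketch works inside a temporary directory so that nothing in your real course folder changes.

```r
# Minimal sketch: absolute vs. relative file paths in base R.
# "demo_project" and "example.csv" are hypothetical names used only for illustration.

# Work inside a temporary directory so nothing in your real folders changes.
project_dir <- file.path(tempdir(), "demo_project")
dir.create(file.path(project_dir, "part2"), recursive = TRUE, showWarnings = FALSE)

# Pretend project_dir is the project folder. (RStudio sets the working
# directory to the project folder when you open an .Rproj file.)
old_wd <- setwd(project_dir)

# A relative path is interpreted starting from the working directory.
rel_path <- file.path("part2", "example.csv")
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)), rel_path, row.names = FALSE)

# An absolute path spells out the full location of the same file.
abs_path <- normalizePath(rel_path)

# Both paths point to the same file, so both reads give the same data frame.
from_rel <- read.csv(rel_path)
from_abs <- read.csv(abs_path)
same_data <- identical(from_rel, from_abs)

print(rel_path)
print(abs_path)
print(same_data)

# Restore the original working directory.
setwd(old_wd)
```

The relative path keeps working if the whole project folder is moved or shared, as long as the working directory is the project folder. The absolute path only works on the computer where it was created.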
Optional
- Exploring missing values in naniar. The notes for part 2 refer to the visdat package for visualizing missing values in your data. I also recommend the naniar package, and the link above is a great tutorial with an introduction to some really useful visualizations for missingness.
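As a quick illustration of what exploring missingness can look like, here is a minimal sketch. It uses the built-in airquality dataset as a stand-in (an assumption, not a dataset from the class notes) and only calls visdat and naniar if they are installed, so the base R summaries run either way.

```r
# Minimal sketch: summarizing and visualizing missing values.
# airquality is a built-in R dataset used here only as a stand-in example.

# Base R: count missing values (NA) in each column.
n_missing <- colSums(is.na(airquality))
print(n_missing)

# Base R: proportion of rows with no missing values at all.
prop_complete <- mean(complete.cases(airquality))
print(prop_complete)

# visdat: overview plots of column types and missingness (if installed).
if (requireNamespace("visdat", quietly = TRUE)) {
  print(visdat::vis_dat(airquality))
  print(visdat::vis_miss(airquality))
}

# naniar: per-variable missingness table and plot (if installed).
if (requireNamespace("naniar", quietly = TRUE)) {
  print(naniar::miss_var_summary(airquality))
  print(naniar::gg_miss_var(airquality))
}
```

If either package is missing, it can be installed with install.packages("visdat") or install.packages("naniar").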
Post-class survey
- Please fill out the post-class survey to provide feedback. Thank you!
- Previous muddiest points and clearest points with responses are collected here.
Homework
- See OneDrive folder for homework assignment.
- HW 2 due on 01/22.
Recording
- In-class recording links are on Sakai. Navigate to Course Materials -> Schedule with links to in-class recordings.