HW 1: BSTA 511-611 F23

Author

Your name here - update this!!!!

Published

October 7, 2023

Due 10/7/23

Download the .qmd file for this assignment from https://github.com/niederhausen/BSTA_511_F23/blob/main/homework/HW_1_F23_bsta511.qmd

Directions

  • Please upload your homework to Sakai. Upload both your .qmd code file and the rendered .html file.
  • For each question, make sure to include all code and resulting output in the html file to support your answers.
  • Use the assignment .qmd file as a template for your own assignment.
Tip

It is a good idea to render your document from time to time as you go along. Rendering automatically saves your .qmd file, and rendering frequently helps you catch errors sooner.

Non-book exercises

Upload a photo using Sakai submission

To help me learn your names and faces, please upload a photo of yourself to Sakai. You will find the Upload Photo “assignment” in the Submissions section of Sakai. These photos will be seen only by me and the TA.

Background survey

Please fill out the background survey at https://forms.gle/QsjRN1UxzmcBAP819.
No work to be shown here.

Slack post

Introduce yourself to the class by posting a message in the #random channel on the BSTA 511/611 Slack group.
No work to be shown here.
Slack invite link: https://join.slack.com/t/slack-zlc9838/shared_invite/zt-23wilgaki-yTTT9KlePsEgW3Ik07gg

Book exercises

  • Exercises are in the last section of the chapter.
  • Exercises are numbered as chapter#.exercise#. For example, exercise 1.2 is Chapter 1 #2, which is on pg. 75.

1.2 Sinusitis and antibiotics, Part I.

  • Show the work of your calculations using R code within a code chunk. Make sure that both your code and output are visible in the rendered html file.
  • Write your answers in complete sentences as if communicating the results to a collaborator.
  • If you are having difficulty with exercise 1.2, take a look at exercise 1.1, whose answers are at the back of the book.
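
As an illustration of what "showing your work" in a code chunk can look like, here is a minimal sketch. The counts and the object names (n_treated, n_improved, prop_improved) are made up for this example and are not the numbers from exercise 1.2. In your .qmd file the chunk would open with ```{r} rather than ```r.

```r
# Hypothetical counts for illustration only (not the data from exercise 1.2)
n_treated  <- 40   # participants in a made-up treatment group
n_improved <- 26   # of those, how many improved

# Proportion of the made-up treatment group that improved
prop_improved <- n_improved / n_treated
prop_improved

# The same value expressed as a percentage, rounded to one decimal place
round(100 * prop_improved, 1)
```

Printing the object on its own line (as with prop_improved above) is what makes the result appear in the rendered html file alongside the code.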

1.4 Buteyko method, study components

1.8 Smoking habits of UK residents

1.12 Herbal remedies

1.20 City council survey

1.31 Income at the coffee shop

1.32 Midrange

R exercises

  1. Load all the packages you need below.

R1: Formatting text practice

Write a sentence (or a few) using all the different types of text formatting shown on slide 29 of the Day 1 slides. Your choice of text does not matter and does not even need to make sense, although the TA will appreciate it if you make them laugh.

R2: BRFSS

The Behavioral Risk Factor Surveillance System (BRFSS) is an annual telephone survey of 350,000 people in the United States. The BRFSS is designed to identify risk factors in the adult population and report emerging health trends. For example, respondents are asked about their diet, weekly exercise, possible tobacco use, and health care coverage.

The dataset cdc is a sample of 20,000 people from the survey conducted in 2000, and contains responses from a subset of the questions asked on the survey.

Load the cdc dataset from the web using the source() command below:

source("http://www.openintro.org/stat/data/cdc.R")
  • Answer the questions below about the cdc dataset.
  • Please do not delete the statements of the questions, so that they remain numbered in the correct order.
  • Show the work of your calculations using R code within a code chunk. Make sure that both your code and output are visible in the rendered html file.
  • Write your answers in complete sentences as if communicating the results to a collaborator.
  1. How many rows and columns are in the dataset?

  2. For each variable, identify both its “statistical” variable type (numerical (discrete or continuous) or categorical (nominal or ordinal)) and its R variable type.
    Fill in your answers in the table below.

variable name    R type     variable type
genhlth          fill in    fill in
exerany          etc.
hlthplan
smoke100
height
weight
wtdesire
age
gender
  3. What is the difference between the average weight and the average desired weight?

  4. Which of the height, weight, and desired weight variables has the most variability? Which has the least variability?

  5. Calculate the mean of the hlthplan variable. How do we interpret this mean? In other words, what does this mean measure?
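
If you are unsure which R functions might be useful for the questions above, the sketch below shows a few base R functions on a small made-up data frame. The data frame toy and its columns insured and health are hypothetical and exist only for this illustration; this is not the cdc dataset, and the standard deviation is only one possible way to measure variability.

```r
# Made-up data frame for illustration only; this is NOT the cdc dataset
toy <- data.frame(
  height   = c(64, 70, 67, 72, 61),
  weight   = c(150, 180, 165, 200, 120),
  wtdesire = c(140, 175, 160, 185, 120),
  insured  = c(1, 0, 1, 1, 0),   # hypothetical 0/1 indicator column
  health   = factor(c("good", "fair", "good", "excellent", "poor"))
)

dim(toy)             # number of rows, then number of columns
str(toy)             # compact overview, including the R type of each column
sapply(toy, class)   # R class of each column as a named vector

# Difference between the means of two columns
mean(toy$weight) - mean(toy$wtdesire)

# Standard deviation of several numeric columns at once
sapply(toy[, c("height", "weight", "wtdesire")], sd)

# mean() works on a 0/1 column just as on any other numeric column
mean(toy$insured)
```

To adapt this to the homework, replace toy with cdc and use the column names listed in the table above.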