# install.packages('fivethirtyeightdata', repos = 'https://fivethirtyeightdata.github.io/drat/', type = 'source')
library(fivethirtyeightdata)
library(beepr) # provides beep()
library(skimr) # provides skim()
beep(5) #example 1

beepr::beep
BSTA 526 Functions of the Week
See the syllabus for links to presentations from previous years.
0.1 Instructions
- Please sign up for a function(s) here (enter your name and the week you want to present): URL coming soon
- Please submit on Sakai
  - both the .qmd and the .html files, and
  - your dataset if you are loading your own dataset (without your dataset I will not be able to render the file and add it to the website)
- It is VERY important that your yaml is updated so that posting your file on the website is seamless. Update in the yaml above the following:
  - title: use the format package::function, such as dplyr::slice.
  - description: a brief description of your function(s)
  - author: your name
  - date: date you are presenting
- Do not change the pagetitle or subtitle.
- Your submission will be added to the class website. Remove your name from the yaml if you do not wish it to be included and let us know if it is okay to post it anonymously.
- Delete the sections with the Instructions and Dataset instructions from this file before submitting.
0.2 Dataset instructions
- Please use a dataset that is publicly available. In particular, do not use a dataset with PHI that we cannot publicly share.
- Include a description of the dataset and from where it was downloaded or how it was created.
- If these are data from a project you have worked on, make sure there is no identifying information and also slightly alter the data so that they are not the original data.
0.2.1 Datasets that are included in an R package
It is easiest to use a dataset that is a part of base R or a part of an R package.
Some R packages that include datasets are:
- The datasets in the package datasets are included with base R and “ready” to use without having to load them first. Learn more about the available datasets here and here.
- palmerpenguins package
- fivethirtyeight package
- A list of R packages and datasets included in them. This list is not comprehensive.
0.2.2 Load your own dataset
- If loading a dataset from a file (which could be one downloaded from the internet somewhere), make sure the dataset is in a folder called data and use here::here() to load it. This is to make it easier to include it in the website.
- Upload the dataset on Sakai along with your .qmd and .html files so that I can render your .qmd file.
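As a rough sketch of the folder setup described above, the path to a file in the data folder can be built with here::here(). The file name my_dataset.csv is a made-up placeholder, not a file that comes with this template, and read.csv() is only one possible way to read it.

```r
# Build a path to a file inside the project's "data" folder.
# "my_dataset.csv" is a hypothetical file name used only for illustration.
data_path <- here::here("data", "my_dataset.csv")

# Only try to read the file if it actually exists,
# so this chunk still runs in a project without that file.
if (file.exists(data_path)) {
  my_data <- read.csv(data_path)
  str(my_data)
} else {
  message("No file found at: ", data_path)
}
```

Because the path is built relative to the project root, the same code should work no matter which folder the .qmd file is rendered from.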
1 Function(s) Name(s)
The beep() function in the beepr package.
2 What is it for?
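It plays a short notification sound when it is called. Placing beep() after a long-running piece of code signals that the code has finished running.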
3 Examples
Provide at least two examples that you have created yourself for your dataset of choice that show how to use the function(s).
Example 1
These are the sound options for beep(). Each one can be selected by its number or by its name.
1. “ping”
2. “coin”
3. “fanfare”
4. “complete”
5. “treasure”
6. “ready”
7. “shotgun”
8. “mario”
9. “wilhelm”
10. “facebook”
11. “sword”
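To show how the names and numbers line up, here is a small sketch. The vector sounds is my own helper for illustration, not something exported by beepr. It assumes beepr numbers the sounds in the order listed above, and the beep() call is skipped if beepr is not installed.

```r
# Sound names in the order listed above (position = sound number).
# `sounds` is a helper vector made for this example, not part of beepr.
sounds <- c("ping", "coin", "fanfare", "complete", "treasure", "ready",
            "shotgun", "mario", "wilhelm", "facebook", "sword")

# Look up the number that goes with a name, and the name that goes with a number.
mario_number <- which(sounds == "mario")
fifth_sound <- sounds[5]
print(mario_number)
print(fifth_sound)

# Play the sound by name; beep(mario_number) should play the same sound.
if (requireNamespace("beepr", quietly = TRUE)) {
  beepr::beep(sounds[mario_number])
}
```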
3.1 Example 2
nba_all_elo <- fivethirtyeightdata::nba_all_elo
skim(nba_all_elo)

| Name | nba_all_elo |
| Number of rows | 63157 |
| Number of columns | 23 |
| _______________________ | |
| Column type frequency: | |
| character | 1 |
| Date | 1 |
| factor | 7 |
| logical | 1 |
| numeric | 13 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| notes | 16291 | 0.74 | 4 | 65 | 0 | 232 | 0 |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| date_game | 0 | 1 | 1946-11-01 | 2015-06-16 | 1990-02-06 | 12426 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| game_id | 0 | 1 | FALSE | 63157 | 194: 1, 194: 1, 194: 1, 194: 1 |
| lg_id | 0 | 1 | FALSE | 2 | NBA: 59008, ABA: 4149 |
| team_id | 0 | 1 | FALSE | 104 | BOS: 3100, NYK: 2855, LAL: 2559, DET: 2532 |
| fran_id | 0 | 1 | FALSE | 53 | Cel: 3100, Lak: 3019, Pis: 2902, Kni: 2855 |
| opp_id | 0 | 1 | FALSE | 104 | NYK: 2914, BOS: 2897, LAL: 2519, DET: 2453 |
| opp_fran | 0 | 1 | FALSE | 53 | Lak: 3005, Six: 2947, War: 2938, Kni: 2914 |
| game_result | 0 | 1 | FALSE | 2 | W: 39311, L: 23846 |
Variable type: logical
| skim_variable | n_missing | complete_rate | mean | count |
|---|---|---|---|---|
| is_playoffs | 0 | 1 | 0.06 | FAL: 59124, TRU: 4033 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| gameorder | 0 | 1 | 31579.00 | 18232.00 | 1.00 | 15790.00 | 31579.00 | 47368.00 | 63157.00 | ▇▇▇▇▇ |
| year_id | 0 | 1 | 1988.20 | 17.58 | 1947.00 | 1975.00 | 1990.00 | 2003.00 | 2015.00 | ▂▅▆▇▇ |
| seasongame | 0 | 1 | 43.52 | 25.39 | 1.00 | 22.00 | 43.00 | 65.00 | 108.00 | ▇▇▇▆▁ |
| pts | 0 | 1 | 104.60 | 15.00 | 2.00 | 95.00 | 104.00 | 114.00 | 184.00 | ▁▁▇▃▁ |
| elo_i | 0 | 1 | 1495.05 | 112.41 | 1105.62 | 1417.15 | 1500.43 | 1576.26 | 1836.66 | ▁▃▇▆▁ |
| elo_n | 0 | 1 | 1495.07 | 112.87 | 1100.29 | 1416.54 | 1500.55 | 1576.45 | 1838.72 | ▁▃▇▆▁ |
| win_equiv | 0 | 1 | 41.69 | 10.67 | 10.15 | 34.07 | 42.11 | 49.66 | 70.40 | ▁▅▇▆▁ |
| opp_pts | 0 | 1 | 100.86 | 14.39 | 0.00 | 92.00 | 101.00 | 110.00 | 186.00 | ▁▁▇▂▁ |
| opp_elo_i | 0 | 1 | 1495.42 | 111.87 | 1091.64 | 1417.34 | 1501.38 | 1575.92 | 1853.10 | ▁▃▇▆▁ |
| opp_elo_n | 0 | 1 | 1495.41 | 112.06 | 1085.77 | 1417.46 | 1501.35 | 1576.06 | 1853.10 | ▁▃▇▆▁ |
| forecast | 0 | 1 | 0.62 | 0.18 | 0.06 | 0.50 | 0.64 | 0.76 | 0.98 | ▁▃▆▇▃ |
| opp_win_equiv | 0 | 1 | 41.72 | 10.58 | 10.30 | 34.15 | 42.12 | 49.62 | 71.11 | ▁▅▇▆▁ |
| opp_seasongame | 0 | 1 | 43.54 | 25.36 | 1.00 | 22.00 | 43.00 | 65.00 | 107.00 | ▇▇▇▆▁ |
beep(3)
beep(6) #this one's funny
beep(8) #marioooo
beep("wilhelm") #can also run by stating the name instead of the number

3.2 Example 3
# Run a loop and beep when complete
total <- 0
for (i in 1:100) {
  total <- total + i
}
beep(8) # plays a mario sound to signal the loop is done
print(total)
[1] 5050
4 Is it helpful?
I think it is a useful function, especially later on when working with longer datasets. I’m currently using it in my midterm, which has almost 9,000 observations: parsing the data takes time, and I tend to get distracted while waiting.
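Here is a minimal sketch of that workflow. It uses simulated data, not my actual midterm data, and the object names (sim, row_squares) are made up for the example. The beep is skipped if beepr is not installed.

```r
# Simulate a dataset of about the same size (9,000 rows); this is NOT the midterm data.
set.seed(526)
n_obs <- 9000
sim <- data.frame(id = seq_len(n_obs), value = rnorm(n_obs))

# Stand-in for a slow row-by-row step: square each value one row at a time.
start_time <- Sys.time()
row_squares <- vapply(
  seq_len(nrow(sim)),
  function(i) sim$value[i]^2,
  numeric(1)
)
elapsed_secs <- as.numeric(difftime(Sys.time(), start_time, units = "secs"))

# Signal that the step is finished.
if (requireNamespace("beepr", quietly = TRUE)) {
  beepr::beep("complete")
}
message("Processed ", length(row_squares), " rows in ", round(elapsed_secs, 2), " seconds")
```

With real data the slow step would be whatever cleaning or modeling code takes a while. The point is only that the beep() call goes right after it.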