class: left bottom hide-count background-image: url(assets/images/bench-accounting-xxeAftHHq6E-unsplash.jpg) background-size: 100% <div class="talk-logo drake-logo"></div> .talk-meta[ .talk-title[ # Reproducible Data Workflows ] .talk-author[ Garrick Aden-Buie ] .talk-date[ July 19th, 2019 ] ] --- class: top hide-count .f1.pt4[ <img class="icon-huge pr3" src="assets/images/cloud-upload.svg" width="100px"/> `rstudio.cloud/project/405721` ] .f1[ <img class="icon-huge pr3" src="assets/images/desktop-download.svg" width="100px"/> .code.moffitt-blue[usethis::use_zip(".moffitt-orange[github 👇]")] ] .f1[ <img class="icon-huge pr3" src="assets/images/mark-github.svg" width="100px"/> [github.com/.moffitt-orange[gadenbuie/drake-intro]](https://github.com/gadenbuie/drake-intro) ] .f2.silver.center.mt5[Find someone to sit next to and share laptops] --- class: inverse center middle hide-count ## What is drake? -- ![](assets/images/drake-meme.jpg) --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-02.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-03.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-04.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-05.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-06.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-07.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-08.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-09.jpg') background-size: contain --- class: 
hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-10.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-11.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-12.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-13.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-14.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-15.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-16.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-17.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-18.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-19.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-20.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-21.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-22.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-23.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-24.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: 
url('assets/images/drawn/messy-workflow-25.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-26.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-27.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-28.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-29.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-30.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-31.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/messy-workflow-32.jpg') background-size: contain --- class: middle background-image: url('assets/images/drawn/messy-workflow-32.jpg') background-size: contain .f2.moffitt-bg-light-blue-o90.white.pa5.shadow-3[ * Will this work when I come back to it later? * What happens if I re-run _everything_? * Am I certain that the results are still valid? 
] --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/project-spectrum-33.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/project-spectrum-34.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/project-spectrum-35.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/project-spectrum-36.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-37.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-38.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-39.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-40.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-41.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-42.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-43.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-44.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-45.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-46.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-47.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: 
url('assets/images/drawn/drake-intro-48.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-49.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-50.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-51.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-52.jpg') background-size: contain --- class: hide-count animated fadeIn background-image: url('assets/images/drawn/drake-intro-53.jpg') background-size: contain --- layout: true class: animated fadeIn background-image: url('assets/images/drake-infographic.svg') background-size: 100% background-position: 40% left --- .bg-white.h-100.w-80.fixed.o-90.cover-2-3[ <!-- Cover left two items of drake infographic --> ] --- .bg-white.h-100.w-80.fixed.o-90.cover-3[ <!-- Cover left items of drake infographic --> ] --- <!-- Full drake infographic --> --- layout: false class: center middle moffitt-bg-blue inverse hide-count <img class="icon-huge pr3" src="assets/images/noun_detour.svg" width="200px"/> # Detour — Functions --- layout: true class: animated fadeIn --- .f1.mt4.pa4.code[ .o-0[verb <- .dark-gray[function(].moffitt-bg-blue.moffitt-blue.pa3.o-0[x, y = 0, ...]) {] .moffitt-green.mh3.pv1.ph1[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] .o-0.dark-gray[}] ] --- .f1.mt4.pa4.code.dark-gray[ .o-0[verb <- function(.moffitt-bg-blue.moffitt-blue.pa3.o-0[x, y = 0, ...]) {] .moffitt-bg-green.moffitt-green.mh3.pv1.ph1[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] .o-0[}] ] --- .f1.mt4.pa4.code.dark-grey[ verb <- function() { .moffitt-bg-green.moffitt-green.mh3.pv1.ph1.o-60[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] } ] --- .f1.mt4.pa4.code.dark-gray[ verb <- function(.moffitt-bg-blue.moffitt-blue.pa3[x, y = 0, 
...]) { .moffitt-bg-green.moffitt-green.mh3.pv1.ph1.o-40[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] } ] --- .f1.mt4.pa4.code.dark-gray[ verb <- function(.moffitt-bg-blue.moffitt-blue.pa3[x, y = 0, ...]) { .moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] } ] --- .f1.mt4.pa4.code.dark-gray[ verb <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, ...]) { .moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[ x_range <- max(.moffitt-blue[x]) - min(.moffitt-blue[x]) (.moffitt-blue[x] - min(.moffitt-blue[x])) / x_range + .moffitt-blue[y] ] } ] --- .f1.mt4.pa4.code.dark-gray[ verb <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, ...]) { .moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[ x_range <- max(.moffitt-blue[x]) - min(.moffitt-blue[x]) .moffitt-bg-orange-o20[(.moffitt-blue[x] - min(.moffitt-blue[x])) / x_range + .moffitt-blue[y]] ] } ] --- .f1.mt4.pa4.code.dark-gray[ verb <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, ...]) { .moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[ x_range <- max(.moffitt-blue[x]) - min(.moffitt-blue[x]) .moffitt-bg-orange-o20[.moffitt-orange[return(](.moffitt-blue[x] - min(.moffitt-blue[x])) / x_range + .moffitt-blue[y].moffitt-orange[)]] ] } ] --- .f1.mt4.pa4.code.dark-gray.moffitt-bg-grey-o20.ba.b--dotted.bw2.b--dark-gray.border-box[ verb <- function(.moffitt-bg-blue.moffitt-blue.pa3.o-30[x, y = 0, ...]) { .moffitt-bg-green.moffitt-green.mh3.pv1.ph1.o-40[ x_range <- max(x) - min(x) (x - min(x)) / x_range + y ] } ] --- .f1.mt4.pa4.code.dark-gray.moffitt-bg-grey-o20[ verb <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, ...]) { .moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[ x_range <- max(.moffitt-blue[x]) - min(.moffitt-blue[x]) (.moffitt-blue[x] - min(.moffitt-blue[x])) / x_range + .moffitt-blue[y] ] } ] --- .f1.mt4.pa4.code.dark-gray.moffitt-bg-grey-o20[ scale <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, ...]) { 
.moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[
  x_range <- max(.moffitt-blue[x]) - min(.moffitt-blue[x])
  (.moffitt-blue[x] - min(.moffitt-blue[x])) / x_range + .moffitt-blue[y]
]
}
]

---

.f1.mt4.pa4.code.dark-gray.moffitt-bg-grey-o20[
scale <- function(.moffitt-bg-blue-o20.moffitt-blue.pa3[x, y = 0, .moffitt-red[...]]) {
.moffitt-bg-green-o10.moffitt-green.mh3.pv1.ph1[
  x_r <- max(.moffitt-blue[x], .moffitt-red[...]) - min(.moffitt-blue[x], .moffitt-red[...])
  (.moffitt-blue[x] - min(.moffitt-blue[x], .moffitt-red[...])) / x_r + .moffitt-blue[y]
]
}
]

---
layout: false

## Function Review

* Functions take **inputs** .muted[(also called .code[formals()])]

--

* ... use the inputs in the function **.code[body()]**

--

* ... or find variables within their **scope** .muted[(or .code[environment()])]

--

* ... and **.code[return()]** a value

---

## Your Turn!

The formula for converting Celsius to Fahrenheit is

.center[
$$T_F = \frac{9}{5} T_C + 32$$
]

* Write a function that converts °C to °F. What is 20°C?

.muted.w-60[
Extra challenge:

* Your friend is Canadian and keeps sharing their local weather in °C. Write a function that decides if it's hot or not based on the infinitely more reasonable threshold of 78°F.

* 78°F is debatable, so make it a parameter.
]
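
One way to approach the exercise, as a minimal sketch rather than the official solution. The names `celsius_to_fahrenheit()`, `is_hot()` and `threshold_f` are illustrative choices and do not come from the workshop materials.

```r
# Convert degrees Celsius to degrees Fahrenheit: T_F = 9/5 * T_C + 32.
# NOTE: function and argument names are assumptions for illustration.
celsius_to_fahrenheit <- function(temp_c) {
  temp_c * 9 / 5 + 32
}

# Extra challenge: decide whether a Celsius temperature counts as "hot".
# The threshold (in Fahrenheit) is an argument so it can be debated.
is_hot <- function(temp_c, threshold_f = 78) {
  celsius_to_fahrenheit(temp_c) > threshold_f
}

celsius_to_fahrenheit(20)
#> [1] 68

is_hot(20)
#> [1] FALSE

is_hot(30)
#> [1] TRUE
```

Because `threshold_f` has a default, `is_hot(20)` keeps working while `is_hot(20, threshold_f = 60)` lets a caller disagree with 78°F.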
???

Check with someone else. What is 20°C? It should be 68°F.

---

## Function Style Guide

.f2.pt4[
* Clear **names** like `verb()` or `verb_thing()`
]

--

.f2.pt4[
* **Short** functions that do **one**.muted[-ish] thing
]

--

.f2.pt4[
* Limit **side-effects** or be explicit .silver[e.g. .code[output = ""]]
]

---
layout: true

## Fail or Return Early

---

Check your inputs at the top of your function. For type checking you can use

```r
verb <- function(x) {
  stopifnot(is.numeric(x))
  # ... more code
}
```

--

or for more control over error messages

```r
verb <- function(x) {
  # any() keeps the check valid when `x` has more than one element
  if (any(x < 0)) {
    stop("`x` must be non-negative")
  }
}
```

---

Return early from functions to minimize `if ... else` indenting

.absolute.pa3.red.bg-washed-red.bottom-2.right-0.w-40.f2[☹]

```r
verb <- function(df) {
  # check that data is valid
  if (nrow(df) > 0) {
    # process the valid data
    answer <- df %>%
      # ... many
      # ... lines
      # ... of code
  } else {
    # or give back a default value
    answer <- FALSE
  }
  answer
}
```

---

Return early from functions to minimize `if ... else` indenting

```r
verb <- function(df) {
  # check that data is valid
  if (nrow(df) == 0) {
    return(FALSE)
  }

  # ... process the valid data
}
```

---
layout: false

## Pass the Dots

Use the power of the .moffitt-red.moffitt-bg-red-o10.code[...] with the .pkg[tidyverse]

```r
library(dplyr)  # provides %>%, group_by() and summarize()

# `...` is passed straight to group_by(); `expr` is a column of `df`
grouped_mean <- function(df, ...) {
  df %>%
    group_by(...) %>%
    summarize(mean_expr = mean(expr))
}
```

.footnote[
Jenny Bryan: [Lazy Evaluation](https://resources.rstudio.com/rstudio-conf-2019/lazy-evaluation), rstudio::conf(2019)

Sharla Gelfand: [tidyeval](https://sharla.party/posts/tidyeval/), sharla.party
]

---
class: middle center moffitt-bg-light-blue white hide-count

## Your Turn Again

Update your temperature conversion function to reject non-numeric or unreasonable inputs
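
A minimal sketch of what the updated function could look like, combining `stopifnot()` for the type check with an explicit `stop()` for the range check. The name `celsius_to_fahrenheit()` and the choice of absolute zero (-273.15°C) as the cutoff for "unreasonable" are assumptions; pick whatever bounds suit your data.

```r
# NOTE: the function name and the absolute-zero cutoff are assumptions.
celsius_to_fahrenheit <- function(temp_c) {
  # Fail early: check the type first
  stopifnot(is.numeric(temp_c))

  # Then check the range, with a custom error message.
  # na.rm = TRUE lets missing values pass through as NA.
  if (any(temp_c < -273.15, na.rm = TRUE)) {
    stop("`temp_c` must be at or above absolute zero (-273.15 degrees C)")
  }

  temp_c * 9 / 5 + 32
}

celsius_to_fahrenheit(20)
#> [1] 68

# Both of these stop with an error instead of returning a value
try(celsius_to_fahrenheit("20"))
try(celsius_to_fahrenheit(-300))
```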
---
exclude: true

.mt1.pa4.code.mw7[
verb <- .dark-gray[function(].moffitt-bg-blue.moffitt-blue.pa3[x, y = 0, ...]) {
.moffitt-bg-green.moffitt-green.mh3.pv1.ph4[
  x_range <- max(x) - min(x)
  (x - min(x)) / x_range + y
]
.dark-gray[}]
]

---
class: inverse center middle

## An Example drake Project

---
layout: true

.center[
## .code[rstudio.cloud/project/405721]
#### .code.gray[usethis::use_course("gadenbuie/drake-intro")]
]

<hr />

.muted[Let's code together...]

---

* Open .moffitt-bg-grey-o20.moffitt-blue.code.pa2[drake.R]

* What packages and functions are loaded?

* Walk through .moffitt-bg-grey-o20.moffitt-blue.code.pa2[R/plan.R]

* Preview the work plan

* Run the plan

---

* Use .moffitt-bg-grey-o20.moffitt-red.code.pa2[readd()] and .moffitt-bg-grey-o20.moffitt-red.code.pa2[loadd()] to view results

* View the .moffitt-bg-grey-o20.moffitt-blue.code.pa2[report.html]

* How are _targets_ used in .moffitt-bg-grey-o20.moffitt-blue.code.pa2[report.Rmd]?

--

* How does .pkg[drake] track dependencies in R Markdown?

--

* Do we need to run .moffitt-bg-grey-o20.moffitt-blue.code.pa2[make.R] to be able to update .moffitt-bg-grey-o20.moffitt-blue.code.pa2[report.Rmd]?

---

The life expectancy plot currently only shows results for Tampa.

* Modify .moffitt-bg-grey-o20.code.pa2[plot_life_exp_gender_income()] in .moffitt-bg-grey-o20.moffitt-blue.code.pa2[R/functions.R] to have a `czname` argument.

* Update the .moffitt-bg-grey-o20.moffitt-blue.code.pa2[R/plan.R] to plot .b[Denver].

* Add .b[Denver] to the .moffitt-bg-grey-o20.moffitt-blue.code.pa2[report.Rmd] to compare with .b[Tampa].
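
The first two steps follow a common pattern: replace a hard-coded value with an argument, then call the function once per value in the plan. The sketch below shows that pattern on toy data. `filter_cz()`, the `life_exp` data frame and its columns are hypothetical stand-ins, since the real body of `plot_life_exp_gender_income()` lives in the project. The target names `plot_tampa` and `plot_denver` and the first argument of the plotting call are assumptions too, and the plan is only built if drake is installed.

```r
# Hypothetical stand-in data with obviously synthetic values
life_exp <- data.frame(
  czname = c("Tampa", "Tampa", "Denver", "Denver"),
  value  = c(1, 2, 3, 4)
)

# Before: the commuting zone is hard-coded inside the function.
# After: it is an argument whose default preserves the old behavior.
# (With dplyr, filter(czname == czname) would compare the column to
#  itself; base subsetting with df$czname avoids that ambiguity.)
filter_cz <- function(df, czname = "Tampa") {
  df[df$czname == czname, , drop = FALSE]
}

filter_cz(life_exp)            # Tampa rows, as before
filter_cz(life_exp, "Denver")  # Denver rows

# One target per city. drake_plan() records the commands without
# running them, so the plotting function is not called here.
if (requireNamespace("drake", quietly = TRUE)) {
  plan <- drake::drake_plan(
    plot_tampa  = plot_life_exp_gender_income(life_exp, czname = "Tampa"),
    plot_denver = plot_life_exp_gender_income(life_exp, czname = "Denver")
  )
}
```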
---

.muted[... if there is time]

* Convert the code in .moffitt-bg-grey-o20.moffitt-blue.code.pa2[R/scratch.R] into a function .moffitt-bg-grey-o20.code.pa2[plot_life_exp_income()] and add it to the .moffitt-bg-grey-o20.moffitt-blue.code.pa2[R/plan.R].

---
layout: false

## But wait, there's more!

There is a lot more that .pkg[drake] can do, including:

1. [Predict total runtime](https://ropenscilabs.github.io/drake-manual/time.html#predict-total-runtime) of your plan

    ```r
    predict_runtime(config)
    ```

1. Parameterized targets for hyperparameter selection in [large plans](https://ropenscilabs.github.io/drake-manual/plans.html#large-plans)

1. [Parallel computation](https://ropenscilabs.github.io/drake-manual/hpc.html#parallel-backends) of targets

1. Distribute and run targets on [HPC clusters](https://ropenscilabs.github.io/drake-manual/hpc.html)

---

### Learn More with These Resources

.flex[
.w-40.ph4[
Drake

* [User Manual](https://ropenscilabs.github.io/drake-manual/)
* [Package Docs](https://ropensci.github.io/drake/)
* [ropensci/drake](https://github.com/ropensci/drake)
]
.w-60.ph4[
Functions

* [Advanced R: Functions](https://adv-r.hadley.nz/functions.html)
* [Functions - Nice R Code](https://nicercode.github.io/guides/functions/)
* [Programming with R: Creating Functions](https://swcarpentry.github.io/r-novice-inflammation/02-func-R/)
]
]

.footnote[
**Icons** by [GitHub Octicons](https://octicons.github.com/) and the following [Noun Project](https://thenounproject.com/search/?q=Report&i=180805) icon creators: [Kirby Wu, TW](https://thenounproject.com/search/?q=json&i=966215), [Iga](https://thenounproject.com/search/?q=document&i=2711779), [Lil Squid](https://thenounproject.com/search/?q=report&i=149914), [Wichai Wi](https://thenounproject.com/search/?q=results&i=2294590), [Nick Kinling](https://thenounproject.com/search/?q=detour&i=788361).
**Built with** [rmarkdown](https://rmarkdown.rstudio.com), [xaringan](https://slides.yihui.name/xaringan), [xaringanthemer](https://pkg.garrickadenbuie.com/xaringanthemer), [remark.js](http://remarkjs.com/), [tachyons.css](http://tachyons.io/), [animate.css](https://daneden.github.io/animate.css/) ] <style type="text/css"> .talk-logo { width: 200px; height: 231px; position: absolute; top: 25%; left: 12%; } .drake-logo { background-image: url('assets/images/drake-logo.svg'); background-size: cover; background-repeat: no-repeat; } .talk-meta { font-family: Overpass; position: absolute; text-align: left; bottom: 10px; left: 25px; } .talk-author { color: #444; font-weight: bold; font-size: 1.5em; line-height: 1em; } .talk-date { color: #666; font-size: 1.25em; line-height: 0; } .icon-huge { position: relative; top: 35px; } .cover-2-3 { left: 33%; } .cover-3 { left: 66%; } </style>