class: center, middle, inverse, title-slide # A Gentle Guide to the Grammar of Graphics
with
ggplot2
## Tampa R Users Meetup ### Garrick Aden-Buie
@grrrck
### 2018-01-23
Follow along:
bit.ly/trug-ggplot2
---
class: fullscreen, inverse, top, center, text-white
background-image: url("images/letter-g.jpg")

.font150[**Brought to you by the letter...**]

---
layout: true
# Why *ggplot2*?

---

.left-column[
![](images/hadley.jpg)
__Hadley Wickham__
]

.right-column[.font150[
The transferrable skills from ggplot2 are not the idiosyncracies of plotting syntax, but a powerful way of thinking about visualisation, as a way of **mapping between variables and the visual properties of geometric objects** that you can perceive.
]
.footnote[<http://disq.us/p/sv640d>]
]

---

## My personal reasons

- .hl[Functional] data visualization
    1. Wrangle data
    2. Map data to visual elements
    3. Tweak scales, guides, axes, labels, theme
- Easy to .hl[reason] about how data drives visualization
- Easy to .hl[iterate]
- Easy to be .hl[consistent]

---
layout: false
# What are we getting into?

<br>
`ggplot2` is a huge package: philosophy + functions
<br>...but it's very well organized

--

<br><br>
*Lots* of examples of not-so-great plots in these slides
<br>...but that's okay

--

<br><br>
Going to throw a lot at you
<br>...but you'll know *where* and *what* to look for

--

.img-right[![](images/poppins-bag.gif)]

--

.img-right[![](images/poppins-bag-kids.gif)]

---
layout: true
# G is for getting started

---

**Easy**: install the [tidyverse](http://tidyverse.org)

```r
install.packages('tidyverse')
```

**Medium**: install just `ggplot2`

```r
install.packages('ggplot2')
```

**Expert**: install from GitHub

```r
devtools::install_github('tidyverse/ggplot2')
```

---

## Load the tidyverse

```r
library(tidyverse)
```

```
## ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
```

```
## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.1     ✔ dplyr   0.7.4
## ✔ tidyr   0.7.2     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0
```

```
## ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────── 
tidyverse_conflicts() ── ## ✖ dplyr::filter() masks stats::filter() ## ✖ dplyr::lag() masks stats::lag() ``` --- ## Other packages you'll need for this adventure ```r library(lubridate) # tidyverse library(reshape2) # install.packages("reshape2") library(babynames) # install.packages("babynames") ``` --- layout: true # gg is for Grammar of Graphics .left-column[ ### Data ```r g <- ggplot() ``` ] --- .right-column[ #### Tidy Data 1. Each variable forms a .hl[column] 2. Each observation forms a .hl[row] 3. Each observational unit forms a table <br><br>The following example draws from ```r data(population, package = "tidyr") ``` ] --- .right-column[ <table> <thead> <tr> <th style="text-align:left;"> country </th> <th style="text-align:right;"> 1995 </th> <th style="text-align:right;"> 2000 </th> <th style="text-align:right;"> 2005 </th> <th style="text-align:right;"> 2010 </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Canada </td> <td style="text-align:right;"> 29.2949 </td> <td style="text-align:right;"> 30.69742 </td> <td style="text-align:right;"> 32.25309 </td> <td style="text-align:right;"> 34.12624 </td> </tr> <tr> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1237.5314 </td> <td style="text-align:right;"> 1280.42858 </td> <td style="text-align:right;"> 1318.17683 </td> <td style="text-align:right;"> 1359.82146 </td> </tr> <tr> <td style="text-align:left;"> USA </td> <td style="text-align:right;"> 268.0397 </td> <td style="text-align:right;"> 284.59440 </td> <td style="text-align:right;"> 298.16580 </td> <td style="text-align:right;"> 312.24712 </td> </tr> </tbody> </table> <br> <table> <thead> <tr> <th style="text-align:right;"> year </th> <th style="text-align:right;"> Canada </th> <th style="text-align:right;"> China </th> <th style="text-align:right;"> USA </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1995 </td> <td style="text-align:right;"> 29.29490 </td> <td style="text-align:right;"> 1237.531 
</td> <td style="text-align:right;"> 268.0397 </td> </tr> <tr> <td style="text-align:right;"> 2000 </td> <td style="text-align:right;"> 30.69742 </td> <td style="text-align:right;"> 1280.429 </td> <td style="text-align:right;"> 284.5944 </td> </tr> <tr> <td style="text-align:right;"> 2005 </td> <td style="text-align:right;"> 32.25309 </td> <td style="text-align:right;"> 1318.177 </td> <td style="text-align:right;"> 298.1658 </td> </tr> <tr> <td style="text-align:right;"> 2010 </td> <td style="text-align:right;"> 34.12624 </td> <td style="text-align:right;"> 1359.821 </td> <td style="text-align:right;"> 312.2471 </td> </tr> </tbody> </table> ] --- .right-column[ ```r tidy1 <- gather(messy1, 'year', 'population', -country) ``` <table> <thead> <tr> <th style="text-align:left;"> country </th> <th style="text-align:left;"> year </th> <th style="text-align:right;"> population </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Canada </td> <td style="text-align:left;"> 1995 </td> <td style="text-align:right;"> 29.295 </td> </tr> <tr> <td style="text-align:left;"> China </td> <td style="text-align:left;"> 1995 </td> <td style="text-align:right;"> 1237.531 </td> </tr> <tr> <td style="text-align:left;"> USA </td> <td style="text-align:left;"> 1995 </td> <td style="text-align:right;"> 268.040 </td> </tr> <tr> <td style="text-align:left;"> Canada </td> <td style="text-align:left;"> 2000 </td> <td style="text-align:right;"> 30.697 </td> </tr> <tr> <td style="text-align:left;"> China </td> <td style="text-align:left;"> 2000 </td> <td style="text-align:right;"> 1280.429 </td> </tr> <tr> <td style="text-align:left;"> USA </td> <td style="text-align:left;"> 2000 </td> <td style="text-align:right;"> 284.594 </td> </tr> <tr> <td style="text-align:left;"> Canada </td> <td style="text-align:left;"> 2005 </td> <td style="text-align:right;"> 32.253 </td> </tr> <tr> <td style="text-align:left;"> China </td> <td style="text-align:left;"> 2005 </td> <td 
style="text-align:right;"> 1318.177 </td> </tr> <tr> <td style="text-align:left;"> USA </td> <td style="text-align:left;"> 2005 </td> <td style="text-align:right;"> 298.166 </td> </tr> <tr> <td style="text-align:left;"> Canada </td> <td style="text-align:left;"> 2010 </td> <td style="text-align:right;"> 34.126 </td> </tr> <tr> <td style="text-align:left;"> China </td> <td style="text-align:left;"> 2010 </td> <td style="text-align:right;"> 1359.821 </td> </tr> <tr> <td style="text-align:left;"> USA </td> <td style="text-align:left;"> 2010 </td> <td style="text-align:right;"> 312.247 </td> </tr> </tbody> </table> ] --- .right-column[ ```r tidy2 <- gather(messy2, 'country', 'population', -year) ``` <table> <thead> <tr> <th style="text-align:right;"> year </th> <th style="text-align:left;"> country </th> <th style="text-align:right;"> population </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1995 </td> <td style="text-align:left;"> Canada </td> <td style="text-align:right;"> 29.295 </td> </tr> <tr> <td style="text-align:right;"> 2000 </td> <td style="text-align:left;"> Canada </td> <td style="text-align:right;"> 30.697 </td> </tr> <tr> <td style="text-align:right;"> 2005 </td> <td style="text-align:left;"> Canada </td> <td style="text-align:right;"> 32.253 </td> </tr> <tr> <td style="text-align:right;"> 2010 </td> <td style="text-align:left;"> Canada </td> <td style="text-align:right;"> 34.126 </td> </tr> <tr> <td style="text-align:right;"> 1995 </td> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1237.531 </td> </tr> <tr> <td style="text-align:right;"> 2000 </td> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1280.429 </td> </tr> <tr> <td style="text-align:right;"> 2005 </td> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1318.177 </td> </tr> <tr> <td style="text-align:right;"> 2010 </td> <td style="text-align:left;"> China </td> <td style="text-align:right;"> 1359.821 
</td> </tr> <tr> <td style="text-align:right;"> 1995 </td> <td style="text-align:left;"> USA </td> <td style="text-align:right;"> 268.040 </td> </tr> <tr> <td style="text-align:right;"> 2000 </td> <td style="text-align:left;"> USA </td> <td style="text-align:right;"> 284.594 </td> </tr> <tr> <td style="text-align:right;"> 2005 </td> <td style="text-align:left;"> USA </td> <td style="text-align:right;"> 298.166 </td> </tr> <tr> <td style="text-align:right;"> 2010 </td> <td style="text-align:left;"> USA </td> <td style="text-align:right;"> 312.247 </td> </tr> </tbody> </table> ] --- layout: true # gg is for Grammar of Graphics .left-column[ ### Data ### Aesthetics ```r g + aes() ``` ] --- .right-column[ Map data to visual elements or parameters - year - population - country ] --- .right-column[ Map data to visual elements or parameters - year → **x** - population → **y** - country → *shape*, *color*, etc. ] --- layout: true # gg is for Grammar of Graphics .left-column[ ### Data ### Aesthetics ### Geoms ```r g + geom_*() ``` ] --- Geometric objects displayed on the plot: .font80[ | Type | Function | |:----:|:--------:| | Point | `geom_point()` | | Line | `geom_line()` | | Bar | `geom_bar()`, `geom_col()` | | Histogram | `geom_histogram()` | | Regression | `geom_smooth()` | | Boxplot | `geom_boxplot()` | | Text | `geom_text()` | | Vert./Horiz. 
Line | `geom_{vh}line()` |
| Count | `geom_count()` |
| Density | `geom_density()` |
]

---

.right-column[
Those are just the [top 10 most popular geoms](https://eric.netlify.com/2017/08/10/most-popular-ggplot2-geoms/)<sup>1</sup>

See <http://ggplot2.tidyverse.org/reference/> for many more options

Or just start typing `geom_` in RStudio

.font70[
```
##  [1] "geom_abline"     "geom_area"       "geom_bar"        "geom_bin2d"
##  [5] "geom_blank"      "geom_boxplot"    "geom_col"        "geom_contour"
##  [9] "geom_count"      "geom_crossbar"   "geom_curve"      "geom_density"
## [13] "geom_density_2d" "geom_density2d"  "geom_dotplot"    "geom_errorbar"
## [17] "geom_errorbarh"  "geom_freqpoly"   "geom_hex"        "geom_histogram"
## [21] "geom_hline"      "geom_jitter"     "geom_label"      "geom_line"
## [25] "geom_linerange"  "geom_map"        "geom_path"       "geom_point"
## [29] "geom_pointrange" "geom_polygon"    "geom_qq"         "geom_quantile"
## [33] "geom_raster"     "geom_rect"       "geom_ribbon"     "geom_rug"
## [37] "geom_segment"    "geom_smooth"     "geom_spoke"      "geom_step"
## [41] "geom_text"       "geom_tile"       "geom_violin"     "geom_vline"
```
]]

.footnote[[1] <https://eric.netlify.com/2017/08/10/most-popular-ggplot2-geoms/>]

---
layout: true
# Our first plot!
---

.left-code[
```r
ggplot(tidy1)
```
]

.right-plot[
<img src="index_files/figure-html/first-plot1a-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(tidy1) +
* aes(x = year,
*     y = population)
```
]

.right-plot[
<img src="index_files/figure-html/first-plot1b-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(tidy1) +
  aes(x = year,
      y = population) +
* geom_point()
```
]

.right-plot[
<img src="index_files/figure-html/first-plot1c-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(tidy1) +
  aes(x = year,
      y = population,
*     color = country) +
  geom_point()
```
]

.right-plot[
<img src="index_files/figure-html/first-plot1-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(tidy1) +
  aes(x = year,
      y = population,
      color = country) +
  geom_point() +
* geom_line()
```

.font80[
```r
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
```
]
]

.right-plot[
<img src="index_files/figure-html/first-plot2-fake-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(tidy1) +
  aes(x = year,
      y = population,
      color = country) +
  geom_point() +
  geom_line(
*   aes(group = country))
```
]

.right-plot[
<img src="index_files/figure-html/first-plot2-out-1.png" width="100%" />
]

---
layout: true
# gg is for Grammar of Graphics

.left-column[
### Data
### Aesthetics
### Geoms

```r
g + geom_*()
```
]

---

.right-column[
```r
geom_*(mapping, data, stat, position)
```

- `data` Geoms can have their own data
    - Has to map onto global coordinates
- `mapping` Geoms can have their own aesthetics
    - Inherit global aesthetics
    - Have geom-specific aesthetics
        - `geom_point` needs `x` and `y`, optional `shape`, `color`, `size`, etc.
        - `geom_ribbon` requires `x`, `ymin` and `ymax`, optional `fill`
        - `?geom_ribbon`
]

---

.right-column[
```r
geom_*(mapping, data, stat, position)
```

- `stat` Some geoms apply further transformations to the data
    - All respect `stat = 'identity'`
    - Ex: `geom_histogram` uses `stat_bin()` to group observations
- `position` Some adjust location of objects
    - `'dodge'`, `'stack'`, `'jitter'`
]

---
layout: true
# Example: Stat and Position

---

.pull-left[
#### Star Wars Characters

.font90[
```r
sw_chars <- starwars %>%
  mutate(
    n_movies = map_int(films, length),
    gender = ifelse(
      !gender %in% c('female', 'male'),
      'other', gender)
  ) %>%
  select(name, gender, n_movies)
```
]]

.pull-right[
<table> <thead> <tr> <th style="text-align:left;"> name </th> <th style="text-align:left;"> gender </th> <th style="text-align:right;"> n_movies </th> </tr> </thead> <tbody>
<tr> <td style="text-align:left;"> Luke Skywalker </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 5 </td> </tr>
<tr> <td style="text-align:left;"> C-3PO </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 6 </td> </tr>
<tr> <td style="text-align:left;"> R2-D2 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 7 </td> </tr>
<tr> <td style="text-align:left;"> Darth Vader </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 4 </td> </tr>
<tr> <td style="text-align:left;"> Leia Organa </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 5 </td> </tr>
<tr> <td style="text-align:left;"> Owen Lars </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:left;"> Beru Whitesun lars </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:left;"> R5-D4 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td 
style="text-align:left;"> Biggs Darklighter </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Obi-Wan Kenobi </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 6 </td> </tr> <tr> <td style="text-align:left;"> Anakin Skywalker </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Wilhuff Tarkin </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Chewbacca </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Han Solo </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:left;"> Greedo </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Jabba Desilijic Tiure </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Wedge Antilles </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Jek Tono Porkins </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Yoda </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Palpatine </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 5 </td> </tr> <tr> <td style="text-align:left;"> Boba Fett </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> IG-88 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Bossk </td> <td 
style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Lando Calrissian </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Lobot </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Ackbar </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Mon Mothma </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Arvel Crynyd </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Wicket Systri Warrick </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Nien Nunb </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Qui-Gon Jinn </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Nute Gunray </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Finis Valorum </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Jar Jar Binks </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Roos Tarpals </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Rugor Nass </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Ric Olié </td> <td style="text-align:left;"> male </td> <td 
style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Watto </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Sebulba </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Quarsh Panaka </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Shmi Skywalker </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Darth Maul </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Bib Fortuna </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Ayla Secura </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Dud Bolt </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Gasgano </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Ben Quadinaros </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Mace Windu </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Ki-Adi-Mundi </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Kit Fisto </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Eeth Koth </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td 
style="text-align:left;"> Adi Gallia </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Saesee Tiin </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Yarael Poof </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Plo Koon </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:left;"> Mas Amedda </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Gregar Typho </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Cordé </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Cliegg Lars </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Poggle the Lesser </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Luminara Unduli </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Barriss Offee </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Dormé </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Dooku </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Bail Prestor Organa </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Jango Fett </td> 
<td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Zam Wesell </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Dexter Jettster </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Lama Su </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Taun We </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Jocasta Nu </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Ratts Tyerell </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> R4-P17 </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Wat Tambor </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> San Hill </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Shaak Ti </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Grievous </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Tarfful </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Raymus Antilles </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Sly Moore </td> <td style="text-align:left;"> female </td> <td 
style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:left;"> Tion Medon </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Finn </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Rey </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Poe Dameron </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> BB8 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Captain Phasma </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:left;"> Padmé Amidala </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 3 </td> </tr> </tbody> </table> ] --- .left-code[ ```r ggplot(sw_chars) + aes(x = n_movies) + geom_bar(stat = "count") ``` ] .right-plot[ <img src="index_files/figure-html/stat-example-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(sw_chars) + aes(x = n_movies, * fill = gender) + geom_bar(stat = "count") ``` ] .right-plot[ <img src="index_files/figure-html/stat-example2-out-1.png" width="100%" /> ] --- .pull-left[.font90[ ```r sw_chars_id <- sw_chars %>% group_by(n_movies, gender) %>% tally ``` ]] .pull-right[ <table> <thead> <tr> <th style="text-align:right;"> n_movies </th> <th style="text-align:left;"> gender </th> <th style="text-align:right;"> n </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 9 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 34 </td> </tr> <tr> <td style="text-align:right;"> 1 </td> <td 
style="text-align:left;"> other </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 6 </td> </tr>
<tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 12 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 3 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 9 </td> </tr>
<tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 2 </td> </tr>
<tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> female </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 4 </td> </tr>
<tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> male </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr>
<tr> <td style="text-align:right;"> 7 </td> <td style="text-align:left;"> other </td> <td style="text-align:right;"> 1 </td> </tr>
</tbody> </table>
]

---

.left-code[
```r
ggplot(sw_chars_id) +
  aes(x = n_movies,
      y = n,
      fill = gender) +
* geom_bar(stat = 'identity')
```

.font80[Note: `geom_col()` is an alias for <br>`geom_bar(stat = 'identity')` ]
]

.right-plot[
<img src="index_files/figure-html/stat-example4-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(sw_chars_id) +
  aes(x = n_movies,
      y = n,
      fill = gender) +
* geom_col(position = "fill")
```
]

.right-plot[
<img 
src="index_files/figure-html/stat-example5-out-1.png" width="100%" />
]

---

.left-code[
```r
ggplot(sw_chars_id) +
  aes(x = n_movies,
      y = n,
      fill = gender) +
* geom_col(position = "dodge")
```
]

.right-plot[
<img src="index_files/figure-html/stat-example3-out-1.png" width="100%" />
]

---
layout: false
exclude: true
# Stat and position are functions too

#### Stat transformations

```
##  [1] "stat_bin"         "stat_bin_2d"      "stat_bin_hex"     "stat_bin2d"
##  [5] "stat_binhex"      "stat_boxplot"     "stat_contour"     "stat_count"
##  [9] "stat_density"     "stat_density_2d"  "stat_density2d"   "stat_ecdf"
## [13] "stat_ellipse"     "stat_function"    "stat_identity"    "stat_qq"
## [17] "stat_quantile"    "stat_smooth"      "stat_spoke"       "stat_sum"
## [21] "stat_summary"     "stat_summary_2d"  "stat_summary_bin" "stat_summary_hex"
## [25] "stat_summary2d"   "stat_unique"      "stat_ydensity"
```

#### Position transformations

```
## [1] "position_dodge"       "position_fill"        "position_identity"
## [4] "position_jitter"      "position_jitterdodge" "position_nudge"
## [7] "position_stack"
```

---
layout: true
# gg is for Grammar of Graphics

.left-column[
### Data
### Aesthetics
### Geoms
### Facet

```r
g + facet_wrap()
g + facet_grid()
```
]

---

.right-column[
```r
g <- ggplot(sw_chars) +
  aes(x = n_movies,
      fill = gender) +
  geom_bar()
```
]

---

.right-column[
```r
g + facet_wrap(~ gender)
```

<img src="index_files/figure-html/unnamed-chunk-3-1.png" width="75%" />
]

---

.right-column[
```r
g + facet_grid(gender ~ hair_color)
```

<img src="index_files/figure-html/unnamed-chunk-4-1.png" width="75%" />
]

---

.right-column[
```r
g + facet_grid(gender ~ hair_color, scales = 'free_y')
```

<img src="index_files/figure-html/unnamed-chunk-5-1.png" width="75%" />
]

---
layout: true
# gg is for Grammar of Graphics

.left-column[
### Data
### Aesthetics
### Geoms
### Facet
### Labels

```r
g + labs()
```
]

---

.right-column[
```r
g <- g +
  labs(
    x = "Film Appearances",
    y = "Count of Characters",
    title = "Recurring Star Wars Characters",
    subtitle = "How often do 
characters appear?",
    fill = "Gender"
  )
```
]

---

.right-column[
<img src="index_files/figure-html/unnamed-chunk-7-1.png" width="90%" />
]

---
layout: true
# gg is for Grammar of Graphics

.left-column[
### Data
### Aesthetics
### Geoms
### Facet
### Labels
### Scales

```r
g + scale_*_*()
```
]

---

.right-column[
`scale` + `_` + `<aes>` + `_` + `<type>` + `()`

What parameter do you want to adjust? → `<aes>` <br>
What type is the parameter? → `<type>`

- I want to change my discrete x-axis<br>`scale_x_discrete()`
- I want to change point size from a continuous variable<br>`scale_size_continuous()`
- I want to rescale the y-axis to a log scale<br>`scale_y_log10()`
- I want to use a different color palette<br>`scale_fill_discrete()`<br>`scale_color_manual()`
]

---

.right-column[
```r
g <- g + scale_fill_brewer(palette = 'Set1')
```

<img src="index_files/figure-html/unnamed-chunk-9-1.png" width="90%" />
]

---
layout: true
# gg is for Grammar of Graphics

.left-column[
### Data
### Aesthetics
### Geoms
### Facet
### Labels
### Scales
### Theme

```r
g + theme()
```
]

---

.right-column[
Change the appearance of plot decorations<br>
i.e. 
things that aren't mapped to data

A few "starter" themes ship with the package

- `g + theme_bw()`
- `g + theme_dark()`
- `g + theme_gray()`
- `g + theme_light()`
- `g + theme_minimal()`
]

---

.right-column[
Huge number of parameters, grouped by plot area:

- Global options: `line`, `rect`, `text`, `title`
- `axis`: x-, y- or other axis title, ticks, lines
- `legend`: Plot legends
- `panel`: Actual plot area
- `plot`: Whole image
- `strip`: Facet labels
]

---

.right-column[
Theme options are supported by helper functions:

- `element_blank()` removes the element
- `element_line()`
- `element_rect()`
- `element_text()`
]

---

.right-column[
```r
g + theme_bw()
```

<img src="index_files/figure-html/unnamed-chunk-10-1.png" width="90%" />
]

---

.right-column[
.font80[
```r
g + theme_minimal() +
  theme(text = element_text(family = "Palatino"))
```

<img src="index_files/figure-html/unnamed-chunk-11-1.png" width="90%" />
]
]

---

.right-column[
You can also set the theme globally with `theme_set()`

```r
my_theme <- theme_bw() +
  theme(
    text = element_text(family = "Palatino", size = 12),
    panel.border = element_rect(colour = 'grey80'),
    panel.grid.minor = element_blank()
  )

theme_set(my_theme)
```
]

---

.right-column[
```r
g
```

<img src="index_files/figure-html/unnamed-chunk-12-1.png" width="90%" />
]

---

.right-column[
```r
g + theme(legend.position = 'bottom')
```

<img src="index_files/figure-html/unnamed-chunk-13-1.png" width="90%" />
]

---
layout: false
count: hide
class: fullscreen, inverse, top, left, text-white
background-image: url(images/super-grover.jpg)

.font200[You have the power!]
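---

# Putting the layers together

A minimal sketch of the whole grammar in one chain: data, aesthetics, geom and position, labels, scale, theme. It assumes only `ggplot2` is installed. `sw_counts` is a hypothetical name used just for this sketch; its values are copied by hand from the `sw_chars_id` table shown earlier, so the example does not depend on `dplyr` or on which version of `starwars` you have.

```r
library(ggplot2)

# Hand-copied from the sw_chars_id table earlier in the deck.
# `sw_counts` is a hypothetical name used only for this sketch.
sw_counts <- data.frame(
  n_movies = c(1, 1, 1, 2, 2, 3, 3, 3, 4, 5, 5, 6, 6, 7),
  gender   = c("female", "male", "other", "female", "male",
               "female", "male", "other", "male", "female",
               "male", "male", "other", "other"),
  n        = c(9, 34, 3, 6, 12, 3, 9, 1, 2, 1, 4, 1, 1, 1)
)

p <- ggplot(sw_counts) +                       # Data
  aes(x = n_movies, y = n, fill = gender) +    # Aesthetics
  geom_col(position = "dodge") +               # Geom + position
  labs(                                        # Labels
    x = "Film Appearances",
    y = "Count of Characters",
    title = "Recurring Star Wars Characters",
    fill = "Gender"
  ) +
  scale_fill_brewer(palette = "Set1") +        # Scale
  theme_minimal() +                            # Starter theme
  theme(legend.position = "bottom")            # Theme tweak

p
```

Each `+` adds one layer of the grammar, so any single line can be swapped without touching the rest: try `position = "stack"` or `"fill"`, or replace `theme_minimal()` with `theme_bw()`.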
--- class: inverse, center, middle # "Live" Coding ```r data(tips, package = "reshape2") ``` --- # head(tips) <table> <thead> <tr> <th style="text-align:right;"> total_bill </th> <th style="text-align:right;"> tip </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> smoker </th> <th style="text-align:left;"> day </th> <th style="text-align:left;"> time </th> <th style="text-align:right;"> size </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 16.99 </td> <td style="text-align:right;"> 1.01 </td> <td style="text-align:left;"> Female </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 10.34 </td> <td style="text-align:right;"> 1.66 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:right;"> 21.01 </td> <td style="text-align:right;"> 3.50 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 3 </td> </tr> <tr> <td style="text-align:right;"> 23.68 </td> <td style="text-align:right;"> 3.31 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 24.59 </td> <td style="text-align:right;"> 3.61 </td> <td style="text-align:left;"> Female </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:right;"> 25.29 </td> <td 
style="text-align:right;"> 4.71 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:right;"> 8.77 </td> <td style="text-align:right;"> 2.00 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 26.88 </td> <td style="text-align:right;"> 3.12 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:right;"> 15.04 </td> <td style="text-align:right;"> 1.96 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 14.78 </td> <td style="text-align:right;"> 3.23 </td> <td style="text-align:left;"> Male </td> <td style="text-align:left;"> No </td> <td style="text-align:left;"> Sun </td> <td style="text-align:left;"> Dinner </td> <td style="text-align:right;"> 2 </td> </tr> </tbody> </table> --- # tips: tip histogram .left-code[ ```r ggplot(tips) + aes(x = tip) + * geom_histogram( * binwidth = 0.25 * ) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot1-out-1.png" width="100%" /> ] --- layout: true # tips: tip density --- .left-code[ ```r ggplot(tips) + aes(x = tip) + * geom_density( * aes(fill = day) * ) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-density1-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = tip) + geom_density( aes(fill = day), * alpha = 0.4 ) ``` ] 
.right-plot[ <img src="index_files/figure-html/tips-plot-density2-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = tip/total_bill) + geom_density( aes(fill = day) ) + * facet_wrap(~ day) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-density3-out-1.png" width="100%" /> ] --- layout: true # tips: tip vs total --- .left-code[ ```r ggplot(tips) + aes(x = total_bill, * y = tip) + * geom_point() ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total1-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = total_bill, y = tip) + geom_point() + * geom_smooth(method = "lm") ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total2-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = total_bill, y = tip) + geom_point() + geom_smooth(method = "lm")+ * geom_abline( * slope = c(0.2, 0.15), * intercept = 0, color = c('#69b578', "#dd1144"), linetype = 3) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total3-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = total_bill, * y = tip/total_bill) + geom_point() + * geom_hline( yintercept = c(0.2, 0.15), color = c('#69b578', "#dd1144"), linetype = 1) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total4-out-1.png" width="100%" /> ] --- .left-code[ ```r *tips$percent <- * tips$tip/tips$total_bill ggplot(tips) + aes(x = size, * y = percent, * color = smoker) + geom_point() ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total5-out-1.png" width="100%" /> ] --- .left-code[ ```r tips$percent <- tips$tip/tips$total_bill ggplot(tips) + aes(x = size, y = percent, color = smoker) + * geom_jitter(width = 0.25) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total5b-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = day, y = percent, color = sex) + geom_jitter(width = 0.25) + * facet_grid(time ~ smoker) ``` ] .right-plot[ <img 
src="index_files/figure-html/tips-plot-total6-out-1.png" width="100%" /> ] --- .left-code[ ```r tips <- mutate(tips, * time = factor(time, * c("Lunch", "Dinner")), * day = factor(day, * c("Thur", "Fri", * "Sat", "Sun") )) ggplot(tips) + aes(x = day, y = percent, color = sex) + geom_jitter(width = 0.25) + facet_grid(time ~ smoker) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total62-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = day, y = percent, fill = time) + * geom_boxplot() + facet_grid(. ~ smoker) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total7-out-1.png" width="100%" /> ] --- .left-code[ ```r ggplot(tips) + aes(x = day, y = percent, * color = smoker, * fill = smoker) + * geom_violin(alpha = 0.3) + facet_wrap(~ smoker) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total8-out-1.png" width="100%" /> ] --- .left-code[ ```r g <- ggplot(tips) + aes(x = day, y = percent, color = smoker, fill = smoker) + geom_violin(alpha = 0.3) + * geom_jitter(alpha = 0.4, * width = 0.25, * size = 0.8)+ facet_wrap(~ smoker) g ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total9-out-1.png" width="100%" /> ] --- .left-code[ ```r g + guides(color = FALSE, fill = FALSE) + labs(x = '', y = 'Tip Rate') + * scale_y_continuous( * labels = scales::percent * ) ``` ] .right-plot[ <img src="index_files/figure-html/tips-plot-total10-out-1.png" width="100%" /> ] --- layout: false class: inverse, center, middle # Level up ```r data(babynames, package = 'babynames') ``` --- # head(babynames) The [babynames package](https://github.com/hadley/babynames) contains data provided by the US Social Security Administration: * `babynames`: For each year from 1880 to 2015, the number of children of <br> each sex given each name. All names with at least 5 uses are given. 
<table> <thead> <tr> <th style="text-align:right;"> year </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> name </th> <th style="text-align:right;"> n </th> <th style="text-align:right;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1989 </td> <td style="text-align:left;"> F </td> <td style="text-align:left;"> Taunya </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:right;"> 1982 </td> <td style="text-align:left;"> M </td> <td style="text-align:left;"> Zebulin </td> <td style="text-align:right;"> 5 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:right;"> 1884 </td> <td style="text-align:left;"> M </td> <td style="text-align:left;"> Wheeler </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:right;"> 1989 </td> <td style="text-align:left;"> F </td> <td style="text-align:left;"> Neelam </td> <td style="text-align:right;"> 18 </td> <td style="text-align:right;"> 0.000 </td> </tr> <tr> <td style="text-align:right;"> 1957 </td> <td style="text-align:left;"> F </td> <td style="text-align:left;"> Kelly </td> <td style="text-align:right;"> 1909 </td> <td style="text-align:right;"> 0.001 </td> </tr> <tr> <td style="text-align:right;"> 2004 </td> <td style="text-align:left;"> F </td> <td style="text-align:left;"> Gizell </td> <td style="text-align:right;"> 8 </td> <td style="text-align:right;"> 0.000 </td> </tr> </tbody> </table> --- layout: true # Most popular baby names in 2015 --- .pull-left[ ```r babynames_pop2015 <- babynames %>% filter(year == 2015) %>% mutate( n = n/1000, sex = case_when( sex == "F" ~ "Girl Names", TRUE ~ "Boy Names" )) %>% group_by(sex) %>% top_n(10, n) ``` ] .pull-right[ <table> <thead> <tr> <th style="text-align:right;"> year </th> <th style="text-align:left;"> sex </th> <th style="text-align:left;"> name </th> <th 
style="text-align:right;"> n </th> <th style="text-align:right;"> prop </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Boy Names </td> <td style="text-align:left;"> Noah </td> <td style="text-align:right;"> 19.511 </td> <td style="text-align:right;"> 0.010 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Boy Names </td> <td style="text-align:left;"> Liam </td> <td style="text-align:right;"> 18.281 </td> <td style="text-align:right;"> 0.009 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Boy Names </td> <td style="text-align:left;"> Mason </td> <td style="text-align:right;"> 16.535 </td> <td style="text-align:right;"> 0.008 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Boy Names </td> <td style="text-align:left;"> Jacob </td> <td style="text-align:right;"> 15.816 </td> <td style="text-align:right;"> 0.008 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Girl Names </td> <td style="text-align:left;"> Emma </td> <td style="text-align:right;"> 20.355 </td> <td style="text-align:right;"> 0.011 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Girl Names </td> <td style="text-align:left;"> Olivia </td> <td style="text-align:right;"> 19.553 </td> <td style="text-align:right;"> 0.010 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Girl Names </td> <td style="text-align:left;"> Sophia </td> <td style="text-align:right;"> 17.327 </td> <td style="text-align:right;"> 0.009 </td> </tr> <tr> <td style="text-align:right;"> 2015 </td> <td style="text-align:left;"> Girl Names </td> <td style="text-align:left;"> Ava </td> <td style="text-align:right;"> 16.286 </td> <td style="text-align:right;"> 0.008 </td> </tr> </tbody> </table> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + * 
aes(y = n, x = name) + * geom_col() ``` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-19-1.png" width="306" height="99%" /> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + aes(y = n, x = name) + geom_col() + * coord_flip() ``` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-20-1.png" width="306" height="99%" /> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + * aes(y = n, x = fct_reorder(name, n)) + geom_col() + coord_flip() ``` <br>📦 `fct_reorder()` comes from the tidyverse package `forcats` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-21-1.png" width="306" height="99%" /> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + * aes(y = n, x = fct_reorder(name, n), fill = sex) + geom_col() + coord_flip() ``` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-22-1.png" width="306" height="99%" /> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + aes(y = n, x = fct_reorder(name, n), fill = sex) + geom_col() + coord_flip() + * facet_wrap( ~ sex, scales = 'free_y') ``` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-23-1.png" width="306" height="99%" /> ] --- ```r g_babynames <- ggplot(babynames_pop2015) + aes(y = n, x = fct_reorder(name, n), fill = sex) + geom_col() + * geom_text( * aes(label = format(n*1000, big.mark = ',')), * size = 9, hjust = 1.1, * color = 'white', family = 'Fira Sans' * ) + coord_flip() + facet_wrap( ~ sex, scales = 'free_y') ``` .plot-callout[ <img src="index_files/figure-html/unnamed-chunk-24-1.png" width="306" height="99%" /> ] --- ```r g_babynames + labs(x = '', y = 'Number of Babies Born in 2015 (thousands)') + guides(fill = FALSE) + scale_fill_manual( values = c("Boy Names" = "#77cbb9", "Girl Names" = "#a077cb")) + theme( strip.text = element_text(face = 'bold', size = 20), strip.background = element_blank(), text = element_text(size = 24) ) ``` .plot-callout[ <img src="index_files/figure-html/babynames-popular-out-callout-1.png" width="306" 
height="99%" /> ] --- <img src="index_files/figure-html/babynames-popular-out-1.png" width="100%" /> --- layout: true # Gender-bending baby names --- Find babynames that were 1. More "boyish" or "girlish" in pre-1900s and opposite in post-1900s 2. Pick top 10 boy ↔ girl names -- **Boy → Girl Names:**<br> Madison, Ashley, Alexis, Lauren, Taylor, Addison, Sydney, Allison, Morgan, Aubrey **Girl → Boy Names:**<br> Ollie, Jean, Lou, Cruz, Frankie, Alpha, Artie, Vinnie, Donnie, Lue --- .pull-left[ Data-preprocessing: 1. Un-tidy `sex` column into `Female` and `Male` 2. Calculate difference in proportion by name 3. Add groups for area plot (thank you [stackoverflow](https://stackoverflow.com/a/7883556)!) <br><br>Check out `babynames-prep.R` in repo ] .pull-right[ <table> <thead> <tr> <th style="text-align:right;"> year </th> <th style="text-align:left;"> name </th> <th style="text-align:right;"> prop </th> <th style="text-align:right;"> prop_group </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 1884 </td> <td style="text-align:left;"> Taylor </td> <td style="text-align:right;"> -0.00017 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:right;"> 1901 </td> <td style="text-align:left;"> Jean </td> <td style="text-align:right;"> 0.00103 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:right;"> 1917 </td> <td style="text-align:left;"> Frankie </td> <td style="text-align:right;"> 0.00027 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:right;"> 1937 </td> <td style="text-align:left;"> Allison </td> <td style="text-align:right;"> -0.00002 </td> <td style="text-align:right;"> 1 </td> </tr> <tr> <td style="text-align:right;"> 1967 </td> <td style="text-align:left;"> Frankie </td> <td style="text-align:right;"> -0.00024 </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 1981 </td> <td style="text-align:left;"> Donnie </td> <td 
style="text-align:right;"> -0.00024 </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 1984 </td> <td style="text-align:left;"> Allison </td> <td style="text-align:right;"> 0.00334 </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:right;"> 1994 </td> <td style="text-align:left;"> Ollie </td> <td style="text-align:right;"> -0.00001 </td> <td style="text-align:right;"> 4 </td> </tr> <tr> <td style="text-align:right;"> 2010 </td> <td style="text-align:left;"> Addison </td> <td style="text-align:right;"> 0.00518 </td> <td style="text-align:right;"> 2 </td> </tr> <tr> <td style="text-align:right;"> 2014 </td> <td style="text-align:left;"> Morgan </td> <td style="text-align:right;"> 0.00134 </td> <td style="text-align:right;"> 2 </td> </tr> </tbody> </table> ] --- ```r ggplot(sel_change_babynames) + aes(x = year, y = prop) ``` .plot-callout[ <img src="index_files/figure-html/genben-plot1-out-1.png" width="306" height="99%" /> ] --- ```r ggplot(sel_change_babynames) + aes(x = year, y = prop) + * geom_line(color = "grey50", aes(group=name)) ``` .plot-callout[ <img src="index_files/figure-html/genben-plot2-out-1.png" width="306" height="99%" /> ] --- ```r ggplot(sel_change_babynames) + aes(x = year, y = prop, fill = prop > 0) + * geom_area(aes(group = prop_group)) + geom_line(color = "grey50", aes(group=name))+ * facet_wrap(~ name, scales = 'free_y', ncol = 5) ``` .plot-callout[ <img src="index_files/figure-html/genben-plot3-out-1.png" width="306" height="99%" /> ] --- ```r g_bnc <- ggplot(sel_change_babynames) + aes(x = year, y = prop, fill = prop > 0) + geom_area(aes(group = prop_group)) + geom_line(color = "grey50", aes(group=name))+ facet_wrap(~ name, scales = 'free_y', ncol = 5) + * scale_fill_manual(values = c("#6ec4db", "#fa7c92")) + * guides(fill = FALSE) + * labs(x = '', y = '') g_bnc ``` .plot-callout[ <img src="index_files/figure-html/genben-plot4-out-1.png" width="306" height="99%" /> ] --- ```r 
g_bnc <- g_bnc + theme_minimal(base_family = 'Palatino') + theme( axis.text.y = element_blank(), strip.text = element_text(size = 18, face = 'bold'), panel.grid.major.y = element_blank(), panel.grid.minor.y = element_blank(), panel.grid.minor.x = element_blank(), panel.grid.major.x = element_line(color = "grey80", linetype = 3)) ``` .plot-callout[ <img src="index_files/figure-html/genben-plot5-out-1.png" width="306" height="99%" /> ] --- <img src="index_files/figure-html/genben-plot-out-1.png" width="1152" height="99%" /> --- layout: false class: inverse, middle, center # g is for Goodbye --- layout: true # Stack Exchange is Awesome --- ![](images/stack-exchange-search.png) --- ![](images/stack-exchange-answer.png) --- layout: false # ggplot2 Extensions: ggplot2-exts.org ![](images/ggplot2-exts-gallery.png) --- # ggplot2 and beyond ### Learn more - **ggplot2 docs:** <http://ggplot2.tidyverse.org/> - **R4DS - Data visualization:** <http://r4ds.had.co.nz/data-visualisation.html> - **Hadley Wickham's ggplot2 book:** <https://www.amazon.com/dp/0387981403/> ### Noteworthy RStudio Add-Ins - [ggThemeAssist](https://github.com/calligross/ggthemeassist): Customize your ggplot theme interactively - [ggedit](https://github.com/metrumresearchgroup/ggedit): Layer, scale, and theme editing --- # Practice and Review ### Fun Datasets - `fivethirtyeight` - `nycflights13` - `ggplot2movies` - `population` and `who` in `tidyr` ### Review - Slides and code on GitHub: <http://github.com/gadenbuie/trug-ggplot2> --- class: inverse, center, middle # Thanks! .font150.text-white[ @grrrck <br> github.com/gadenbuie <br> Garrick Aden-Buie ]