library(tidyverse)
1 Exercise collection: Life expectancy
Get the data from this source.
gapminder_raw <- read_csv("https://raw.githubusercontent.com/swcarpentry/r-novice-gapminder/gh-pages/_episodes_rmd/data/gapminder-FiveYearData.csv")
2 Disclosure
These exercises are based on a tutorial by Rebekka Barter. Great work!
3 Research questions
How did life expectancy change over the last decades? Did it change differently between the continents?
How does life expectancy differ today between the continents?
Is life expectancy related to GDP? If so, to what degree (and in what form)? Is this association moderated by continent?
4 First steps
First, open a script or Rmd file in RStudio. Next, make sure you start (“load”) the necessary R packages and import the data.
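After importing, it helps to take a quick look at what arrived. The following is a minimal sketch, not the official solution: it writes a few made-up rows to a temporary CSV so that it runs offline. The country names and values are invented, and the column names are assumed to mirror those of the gapminder file. With the real data you would inspect `gapminder_raw` the same way.

```r
library(tidyverse)

# Made-up rows for illustration only; column names assumed to match the gapminder file.
toy_csv <- tempfile(fileext = ".csv")
write_lines(
  c("country,year,pop,continent,lifeExp,gdpPercap",
    "Aland,2002,1000000,Europe,78.1,30000",
    "Aland,2007,1100000,Europe,79.4,33000",
    "Bolan,2007,5000000,Americas,72.3,9000"),
  toy_csv
)

# Import the CSV file into a tibble.
toy <- read_csv(toy_csv)

glimpse(toy)           # column names and column types
count(toy, continent)  # number of rows per continent
```

`glimpse()` shows whether numeric columns were actually parsed as numbers, which is worth checking before any computation.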
5 Getting help
- Data wrangling cheatsheet
- Data visualization cheatsheet
- It’s a no-brainer, but it works: just google it. For example, if you are struggling with how to change the transparency of dots in a (ggplot2) plot, try “ggplot2 reduce transparency points” or similar queries.
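For the transparency example, such a search typically leads to the `alpha` argument of `geom_point()`. A minimal sketch with made-up points (the data and the object names `toy` and `p` are illustrative assumptions):

```r
library(tidyverse)

# Made-up points; three of them nearly overlap.
toy <- tibble(x = c(1, 1.1, 1.2, 3), y = c(2, 2.1, 1.9, 4))

# alpha ranges from 0 (invisible) to 1 (opaque); 0.3 means 30% opacity.
p <- ggplot(toy, aes(x = x, y = y)) +
  geom_point(alpha = 0.3, size = 4)

p
```

Because `alpha` is set outside `aes()`, it applies as a constant to all points rather than being mapped to a variable.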
6 Exercises
6.1 Data Wrangling
Filter the data for the Americas in 2007, deselect all other variables.
Create the variable gdp, defined as the product of population size and GDP per person.
Identify the observation with the lowest GDP per person.
Identify all observations with above average life expectancy, stratified for each continent.
Count the observations identified in the last step.
Compute the mean life expectancy (the grand mean, i.e., across all observations).
Compute the mean life expectancy for each year.
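The dplyr verbs these exercises call for can be sketched on a toy data set. This is one possible approach under stated assumptions, not the linked solution: the rows are made up, the column names are assumed to match `gapminder_raw`, and the columns kept in `select()` are an arbitrary illustrative choice.

```r
library(tidyverse)

# Made-up rows for illustration; column names assumed to match gapminder_raw.
toy <- tibble(
  country   = c("Aland", "Bolan", "Cesia", "Doria"),
  continent = c("Europe", "Europe", "Americas", "Americas"),
  year      = c(2007, 2007, 2007, 2002),
  pop       = c(1e6, 2e6, 5e6, 4e6),
  lifeExp   = c(80, 70, 75, 65),
  gdpPercap = c(30000, 20000, 9000, 8000)
)

# Keep rows matching both conditions, then keep only some columns.
americas_2007 <- toy %>%
  filter(continent == "Americas", year == 2007) %>%
  select(country, lifeExp, gdpPercap)

# Add a derived column: total GDP = population size * GDP per person.
toy_gdp <- toy %>%
  mutate(gdp = pop * gdpPercap)

# Row(s) with the smallest GDP per person.
poorest <- toy %>%
  filter(gdpPercap == min(gdpPercap))

# Rows above the mean life expectancy of their own continent.
above_avg <- toy %>%
  group_by(continent) %>%
  filter(lifeExp > mean(lifeExp)) %>%
  ungroup()

# Number of rows identified in the previous step.
n_above <- nrow(above_avg)

# Grand mean, and mean per year.
grand_mean <- toy %>% summarise(mean_lifeExp = mean(lifeExp))
mean_by_year <- toy %>%
  group_by(year) %>%
  summarise(mean_lifeExp = mean(lifeExp))
```

Note that `filter()` after `group_by()` evaluates `mean(lifeExp)` within each group, which is what "stratified for each continent" asks for.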
6.2 Data Visualization
Create a scatter plot showing the association of GDP per person and life expectancy. Put the putative cause on the x axis and the putative effect on the y axis.
Add a smoothing line (for example, a LOESS smoother, which fits local regressions much like a rolling average).
Add a linear model line.
Create a scatter plot with year on the x axis, and life expectancy on the y axis. Each point should indicate the average life expectancy per year. Connect the dots with a line.
Modify the last plot so that there is a line for each continent (i.e., group by continent).
Create a scatter plot showing the association of GDP per person and life expectancy. Put the putative cause on the x axis and the putative effect on the y axis. The color of the dots should map to the respective continent.
Modify the last plot so that the size of the dots represents the population size. In addition, increase the transparency of the dots in order to mitigate overplotting.
Modify the last plot so that there’s a facet (sub-plot) for each continent.
Modify the last plot so that GDP is log transformed.
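The layering idea behind these plots can be sketched as follows. This is a hedged illustration on made-up data, not the linked solution: the values are invented, the column names are assumed to match `gapminder_raw`, and a linear model line stands in for the smoother because the toy data has too few points for LOESS to be meaningful.

```r
library(tidyverse)

# Made-up rows for illustration; column names assumed to match gapminder_raw.
toy <- tibble(
  continent = rep(c("Europe", "Americas"), each = 5),
  gdpPercap = c(5000, 10000, 20000, 30000, 40000,
                2000, 4000, 8000, 12000, 16000),
  lifeExp   = c(68, 72, 76, 79, 81, 60, 64, 69, 72, 74),
  pop       = c(1, 2, 3, 4, 5, 2, 4, 6, 8, 10) * 1e6
)

p <- ggplot(toy, aes(x = gdpPercap, y = lifeExp)) +
  # color and size are mapped to variables; alpha is a constant
  geom_point(aes(color = continent, size = pop), alpha = 0.5) +
  # linear model line; method = "loess" would give a LOESS smoother instead
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # one sub-plot per continent
  facet_wrap(~ continent) +
  # log10-transformed x axis
  scale_x_log10()

p
```

Each `+` adds one layer or setting, so the exercises can be solved incrementally by starting from the bare scatter plot and adding one line at a time.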
7 Solutions
You’ll find the solutions here.
8 Reproducibility
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.0.2 (2020-06-22)
#> os macOS 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2021-02-24
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.0)
#> blogdown 1.1 2021-01-19 [1] CRAN (R 4.0.2)
#> bookdown 0.21.6 2021-02-02 [1] Github (rstudio/bookdown@6c7346a)
#> bslib 0.2.4.9000 2021-02-02 [1] Github (rstudio/bslib@b3cd7a9)
#> cachem 1.0.4 2021-02-13 [1] CRAN (R 4.0.2)
#> callr 3.5.1 2020-10-13 [1] CRAN (R 4.0.2)
#> cli 2.3.1 2021-02-23 [1] CRAN (R 4.0.2)
#> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.2)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 4.0.0)
#> devtools 2.3.2 2020-09-18 [1] CRAN (R 4.0.2)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.2)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
#> jquerylib 0.1.3 2020-12-17 [1] CRAN (R 4.0.2)
#> jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.0.2)
#> knitr 1.31 2021-01-27 [1] CRAN (R 4.0.2)
#> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.2)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2)
#> memoise 2.0.0 2021-01-26 [1] CRAN (R 4.0.2)
#> pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 4.0.2)
#> pkgload 1.2.0 2021-02-23 [1] CRAN (R 4.0.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.0)
#> processx 3.4.5 2020-11-30 [1] CRAN (R 4.0.2)
#> ps 1.5.0 2020-12-05 [1] CRAN (R 4.0.2)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.0)
#> R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2)
#> remotes 2.2.0 2020-07-21 [1] CRAN (R 4.0.2)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.2)
#> rmarkdown 2.7 2021-02-19 [1] CRAN (R 4.0.2)
#> rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.0.2)
#> sass 0.3.1 2021-01-24 [1] CRAN (R 4.0.2)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.0)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.0)
#> testthat 3.0.2 2021-02-14 [1] CRAN (R 4.0.2)
#> usethis 2.0.1 2021-02-10 [1] CRAN (R 4.0.2)
#> withr 2.4.1 2021-01-26 [1] CRAN (R 4.0.2)
#> xfun 0.21 2021-02-10 [1] CRAN (R 4.0.2)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
#>
#> [1] /Users/sebastiansaueruser/Rlibs
#> [2] /Library/Frameworks/R.framework/Versions/4.0/Resources/library