1 Load packages
library(tidyverse) # data wrangling
2 Motivation
Let’s assume we would like to change the values of multiple variables depending in the state of another variable.
For the sake of concreteness, let’s say we have some variable called data_trustworthiness
. If this variable (indicating whether or not we can have confidence in some other variables) has the value FALSE
for some cases, we would like to set the varialbe measure1
and measure2
to NA
, thus reflecting that the data from our measurements are not reliable.
Let’s use the tidyverse ecosystem.
3 Minimal example
For the sake of simplicity, we’ll make use of the mtcars
dataset.
First, we build a variable trustworthy
with (for this minimal example) random values.
data(mtcars)
mtcars <-
mtcars |>
mutate(trustworthy = sample(c(1, 0), size = nrow(mtcars), replace = TRUE)) |>
relocate(trustworthy)
glimpse(mtcars) # check
#> Rows: 32
#> Columns: 12
#> $ trustworthy <dbl> 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1…
#> $ mpg <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2…
#> $ cyl <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4…
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 14…
#> $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 1…
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92…
#> $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.…
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22…
#> $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4…
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1…
Next comes the actual conditional change of the data variables (data_vars
).
mtcars_new <-
mtcars |>
mutate(across(.cols = c("mpg", "cyl"),
.fns = ~ifelse(trustworthy == 0, NA, .x)))
glimpse(mtcars_new) # check
#> Rows: 32
#> Columns: 12
#> $ trustworthy <dbl> 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1…
#> $ mpg <dbl> 21.0, NA, NA, NA, NA, NA, NA, 24.4, NA, 19.2, 17.8, NA, 17…
#> $ cyl <dbl> 6, NA, NA, NA, NA, NA, NA, 4, NA, 6, 6, NA, 8, NA, 8, NA, …
#> $ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 14…
#> $ hp <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 1…
#> $ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92…
#> $ wt <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.…
#> $ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22…
#> $ vs <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ am <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1…
#> $ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4…
#> $ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1…
sum(is.na(mtcars_new))
#> [1] 42
head(mtcars_new)
#> trustworthy mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 1 21 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 0 NA NA 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710 0 NA NA 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive 0 NA NA 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout 0 NA NA 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant 0 NA NA 225 105 2.76 3.460 20.22 1 0 3 1
4 Reproducibility
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.1 (2022-06-23)
#> os macOS Big Sur ... 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2023-11-04
#> pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> blogdown 1.18 2023-06-19 [1] CRAN (R 4.2.0)
#> bookdown 0.36 2023-10-16 [1] CRAN (R 4.2.0)
#> bslib 0.5.1 2023-08-11 [1] CRAN (R 4.2.0)
#> cachem 1.0.8 2023-05-01 [1] CRAN (R 4.2.0)
#> callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.0)
#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.2.0)
#> codetools 0.2-19 2023-02-01 [1] CRAN (R 4.2.0)
#> colorout * 1.2-2 2022-06-13 [1] local
#> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.2.0)
#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.1)
#> devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.1)
#> digest 0.6.33 2023-07-07 [1] CRAN (R 4.2.0)
#> dplyr * 1.1.3 2023-09-03 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.21 2023-05-05 [1] CRAN (R 4.2.0)
#> fansi 1.0.5 2023-10-08 [1] CRAN (R 4.2.0)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.0)
#> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.2.0)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
#> ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gtable 0.3.4 2023-08-21 [1] CRAN (R 4.2.0)
#> hms 1.1.3 2023-03-21 [1] CRAN (R 4.2.0)
#> htmltools 0.5.6.1 2023-10-06 [1] CRAN (R 4.2.0)
#> htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.2.0)
#> httpuv 1.6.11 2023-05-11 [1] CRAN (R 4.2.0)
#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
#> jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.2.0)
#> knitr 1.45 2023-10-30 [1] CRAN (R 4.2.1)
#> later 1.3.1 2023-05-02 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.0)
#> lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
#> mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
#> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
#> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.2.0)
#> pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> pkgload 1.3.2.1 2023-07-08 [1] CRAN (R 4.2.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
#> processx 3.8.2 2023-06-30 [1] CRAN (R 4.2.0)
#> profvis 0.3.8 2023-05-02 [1] CRAN (R 4.2.0)
#> promises 1.2.1 2023-08-10 [1] CRAN (R 4.2.0)
#> ps 1.7.5 2023-04-18 [1] CRAN (R 4.2.0)
#> purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.2.0)
#> readr * 2.1.4 2023-02-10 [1] CRAN (R 4.2.0)
#> remotes 2.4.2.1 2023-07-18 [1] CRAN (R 4.2.0)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 4.2.0)
#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.2.0)
#> rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.2.0)
#> sass 0.4.7 2023-07-15 [1] CRAN (R 4.2.0)
#> scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> shiny 1.7.5 2023-08-12 [1] CRAN (R 4.2.0)
#> stringi 1.7.12 2023-01-11 [1] CRAN (R 4.2.0)
#> stringr * 1.5.0 2022-12-02 [1] CRAN (R 4.2.0)
#> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.2.0)
#> tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.2.0)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
#> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.2.0)
#> timechange 0.2.0 2023-01-11 [1] CRAN (R 4.2.0)
#> tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.2.0)
#> urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.0)
#> usethis 2.2.2 2023-07-06 [1] CRAN (R 4.2.0)
#> utf8 1.2.3 2023-01-31 [1] CRAN (R 4.2.0)
#> vctrs 0.6.4 2023-10-12 [1] CRAN (R 4.2.0)
#> withr 2.5.2 2023-10-30 [1] CRAN (R 4.2.1)
#> xfun 0.40 2023-08-09 [1] CRAN (R 4.2.0)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.0)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.0)
#>
#> [1] /Users/sebastiansaueruser/Rlibs
#> [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────