1 Intro
In this post, we try to find a minimal setup for running/fitting a predictive model using the tidymodels approach.
2 Load packages
library(tidyverse) # data wrangling
library(tidymodels)
3 Data
data("penguins", package = "modeldata")
4 Minimal code for fitting a model
m1 <- linear_reg(engine = "glmnet", penalty = 1, mixture = 1) %>%
fit(body_mass_g ~ ., data = penguins)
Note that, for simplicity, we do not care about cross-validation, tuning and preprocessing. In particular, we should normalize the metric predictors and dummytize the nominmal predictors.
We do not even use tidymodels’ workflow
approach here for the sake of minimalism.
I’m not saying that I would recommend this minimal approach though.
5 Results
The tidy
method from {broom}
offers an handy approach to get the model parameters:
m1 %>%
tidy()
#> # A tibble: 10 × 3
#> term estimate penalty
#> <chr> <dbl> <dbl>
#> 1 (Intercept) 82320. 1
#> 2 speciesChinstrap -275. 1
#> 3 speciesGentoo 873. 1
#> 4 islandDream -19.3 1
#> 5 islandTorgersen -53.2 1
#> 6 bill_length_mm 18.4 1
#> 7 bill_depth_mm 55.5 1
#> 8 flipper_length_mm 18.7 1
#> 9 sexmale 386. 1
#> 10 year -41.9 1
In case any predictor beta has been shrunken to zero, we would get a note, see for instance this post
6 Reproducibility
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.0 (2022-04-22)
#> os macOS Big Sur/Monterey 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2022-07-24
#> pandoc 2.18 @ /usr/local/bin/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.2.0)
#> backports 1.4.1 2021-12-13 [1] CRAN (R 4.2.0)
#> blogdown 1.10 2022-05-10 [1] CRAN (R 4.2.0)
#> bookdown 0.27 2022-06-14 [1] CRAN (R 4.2.0)
#> brio 1.1.3 2021-11-30 [1] CRAN (R 4.2.0)
#> broom * 1.0.0 2022-07-01 [1] CRAN (R 4.2.0)
#> bslib 0.3.1 2021-10-06 [1] CRAN (R 4.2.0)
#> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.2.0)
#> callr 3.7.0 2021-04-20 [1] CRAN (R 4.2.0)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.2.0)
#> class 7.3-20 2022-01-16 [2] CRAN (R 4.2.0)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.2.0)
#> codetools 0.2-18 2020-11-04 [2] CRAN (R 4.2.0)
#> colorout * 1.2-2 2022-06-13 [1] local
#> colorspace 2.0-3 2022-02-21 [1] CRAN (R 4.2.0)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.2.0)
#> DBI 1.1.2 2021-12-20 [1] CRAN (R 4.2.0)
#> dbplyr 2.2.0 2022-06-05 [1] CRAN (R 4.2.0)
#> desc 1.4.1 2022-03-06 [1] CRAN (R 4.2.0)
#> devtools 2.4.3 2021-11-30 [1] CRAN (R 4.2.0)
#> dials * 1.0.0 2022-06-14 [1] CRAN (R 4.2.0)
#> DiceDesign 1.9 2021-02-13 [1] CRAN (R 4.2.0)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.2.0)
#> dplyr * 1.0.9 2022-04-28 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.2.0)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.2.0)
#> fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.2.0)
#> forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.2.0)
#> foreach 1.5.2 2022-02-02 [1] CRAN (R 4.2.0)
#> fs 1.5.2 2021-12-08 [1] CRAN (R 4.2.0)
#> furrr 0.3.0 2022-05-04 [1] CRAN (R 4.2.0)
#> future 1.26.1 2022-05-27 [1] CRAN (R 4.2.0)
#> future.apply 1.9.0 2022-04-25 [1] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
#> ggplot2 * 3.3.6 2022-05-03 [1] CRAN (R 4.2.0)
#> glmnet * 4.1-4 2022-04-15 [1] CRAN (R 4.2.0)
#> globals 0.15.0 2022-05-09 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gower 1.0.0 2022-02-03 [1] CRAN (R 4.2.0)
#> GPfit 1.0-8 2019-02-08 [1] CRAN (R 4.2.0)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 4.2.0)
#> hardhat 1.2.0 2022-06-30 [1] CRAN (R 4.2.0)
#> haven 2.5.0 2022-04-15 [1] CRAN (R 4.2.0)
#> hms 1.1.1 2021-09-26 [1] CRAN (R 4.2.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.2.0)
#> httr 1.4.3 2022-05-04 [1] CRAN (R 4.2.0)
#> infer * 1.0.2 2022-06-10 [1] CRAN (R 4.2.0)
#> ipred 0.9-13 2022-06-02 [1] CRAN (R 4.2.0)
#> iterators 1.0.14 2022-02-05 [1] CRAN (R 4.2.0)
#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
#> jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.2.0)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.2.0)
#> lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.0)
#> lava 1.6.10 2021-09-02 [1] CRAN (R 4.2.0)
#> lhs 1.1.5 2022-03-22 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.2.0)
#> listenv 0.8.0 2019-12-05 [1] CRAN (R 4.2.0)
#> lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> MASS 7.3-56 2022-03-23 [2] CRAN (R 4.2.0)
#> Matrix * 1.4-1 2022-03-23 [2] CRAN (R 4.2.0)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
#> modeldata * 1.0.0 2022-07-01 [1] CRAN (R 4.2.0)
#> modelr 0.1.8 2020-05-19 [1] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
#> nnet 7.3-17 2022-01-16 [2] CRAN (R 4.2.0)
#> palmerpenguins 0.1.0 2020-07-23 [1] CRAN (R 4.2.0)
#> parallelly 1.32.0 2022-06-07 [1] CRAN (R 4.2.0)
#> parsnip * 1.0.0 2022-06-16 [1] CRAN (R 4.2.0)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.2.0)
#> pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.2.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
#> processx 3.6.1 2022-06-17 [1] CRAN (R 4.2.0)
#> prodlim 2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#> ps 1.7.1 2022-06-18 [1] CRAN (R 4.2.0)
#> purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> Rcpp 1.0.8.3 2022-03-17 [1] CRAN (R 4.2.0)
#> readr * 2.1.2 2022-01-30 [1] CRAN (R 4.2.0)
#> readxl 1.4.0 2022-03-28 [1] CRAN (R 4.2.0)
#> recipes * 1.0.1 2022-07-07 [1] CRAN (R 4.2.0)
#> remotes 2.4.2 2021-11-30 [1] CRAN (R 4.2.0)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.2.0)
#> rlang 1.0.3 2022-06-27 [1] CRAN (R 4.2.0)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.2.0)
#> rpart 4.1.16 2022-01-24 [2] CRAN (R 4.2.0)
#> rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.2.0)
#> rsample * 1.0.0 2022-06-24 [1] CRAN (R 4.2.0)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.2.0)
#> rvest 1.0.2 2021-10-16 [1] CRAN (R 4.2.0)
#> sass 0.4.1 2022-03-23 [1] CRAN (R 4.2.0)
#> scales * 1.2.0 2022-04-13 [1] CRAN (R 4.2.0)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> shape 1.4.6 2021-05-19 [1] CRAN (R 4.2.0)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.2.0)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.2.0)
#> survival 3.3-1 2022-03-03 [2] CRAN (R 4.2.0)
#> testthat 3.1.4 2022-04-26 [1] CRAN (R 4.2.0)
#> tibble * 3.1.7 2022-05-03 [1] CRAN (R 4.2.0)
#> tidymodels * 1.0.0 2022-07-13 [1] CRAN (R 4.2.0)
#> tidyr * 1.2.0 2022-02-01 [1] CRAN (R 4.2.0)
#> tidyselect 1.1.2 2022-02-21 [1] CRAN (R 4.2.0)
#> tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.2.0)
#> timeDate 3043.102 2018-02-21 [1] CRAN (R 4.2.0)
#> tune * 1.0.0 2022-07-07 [1] CRAN (R 4.2.0)
#> tzdb 0.3.0 2022-03-28 [1] CRAN (R 4.2.0)
#> usethis 2.1.6 2022-05-25 [1] CRAN (R 4.2.0)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.2.0)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.2.0)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0)
#> workflows * 1.0.0 2022-07-05 [1] CRAN (R 4.2.0)
#> workflowsets * 1.0.0 2022-07-12 [1] CRAN (R 4.2.0)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.2.0)
#> xml2 1.3.3 2021-11-30 [1] CRAN (R 4.2.0)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.2.0)
#> yardstick * 1.0.0 2022-06-06 [1] CRAN (R 4.2.0)
#>
#> [1] /Users/sebastiansaueruser/Rlibs
#> [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────