Minimal tidymodels example with the Lasso

1 Intro

In this post, we look for a minimal setup for fitting a predictive model using the tidymodels approach.

2 Load packages

library(tidyverse)  # data wrangling
library(tidymodels)

3 Data

data("penguins", package = "modeldata")

4 Minimal code for fitting a model

m1 <- linear_reg(engine = "glmnet", penalty = 1, mixture = 1) %>% 
  fit(body_mass_g ~ ., data = penguins)

Note that, for simplicity, we skip cross-validation, tuning, and preprocessing. In particular, a proper analysis should normalize the metric predictors and dummy-code the nominal predictors.

For the sake of minimalism, we do not even use tidymodels’ workflow approach here. That does not mean I would recommend this minimal approach, though.
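For comparison, here is a sketch of what the less minimal version might look like, with a recipe for the preprocessing and a workflow bundling recipe and model. The object names (`penguins_complete`, `rec`, `lasso_spec`, `wf`, `m2`) and the decision to drop incomplete rows up front are assumptions made for this illustration; the penalty is still fixed at 1 rather than tuned.

```r
library(tidymodels)

data("penguins", package = "modeldata")

# Assumption for this sketch: simply drop rows with missing values
penguins_complete <- tidyr::drop_na(penguins)

# Preprocessing: normalize metric predictors, dummy-code nominal predictors
rec <- recipe(body_mass_g ~ ., data = penguins_complete) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_dummy(all_nominal_predictors())

# Same Lasso specification as above (mixture = 1), penalty not tuned
lasso_spec <- linear_reg(penalty = 1, mixture = 1) %>%
  set_engine("glmnet")

# Bundle recipe and model into one workflow and fit it
wf <- workflow() %>%
  add_recipe(rec) %>%
  add_model(lasso_spec)

m2 <- fit(wf, data = penguins_complete)

# Pull out the parsnip fit and tidy its coefficients
coefs <- m2 %>%
  extract_fit_parsnip() %>%
  tidy()

coefs
```

Because the predictors are normalized here, the coefficients are on a different scale than those of the minimal model and cannot be compared one to one.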

5 Results

The tidy method from {broom} offers a handy approach to get the model parameters:

m1 %>% 
  tidy()
#> # A tibble: 10 × 3
#>    term              estimate penalty
#>    <chr>                <dbl>   <dbl>
#>  1 (Intercept)        82320.        1
#>  2 speciesChinstrap    -275.        1
#>  3 speciesGentoo        873.        1
#>  4 islandDream          -19.3       1
#>  5 islandTorgersen      -53.2       1
#>  6 bill_length_mm        18.4       1
#>  7 bill_depth_mm         55.5       1
#>  8 flipper_length_mm     18.7       1
#>  9 sexmale              386.        1
#> 10 year                 -41.9       1

If any predictor's beta had been shrunk to zero, we would get a note; see, for instance, this post.
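One way to check for such coefficients directly is to filter the tidied output for estimates that are exactly zero. The following sketch refits the model with a larger penalty; the value 100 and the names `m_strong` and `zeroed` are arbitrary choices for illustration, not taken from the analysis above, and whether any term is actually dropped depends on the data.

```r
library(tidymodels)

data("penguins", package = "modeldata")

# Hypothetical, stronger penalty than in the model above
m_strong <- linear_reg(engine = "glmnet", penalty = 100, mixture = 1) %>%
  fit(body_mass_g ~ ., data = penguins)

# Keep only the terms whose coefficient was shrunk to exactly zero
zeroed <- m_strong %>%
  tidy() %>%
  filter(estimate == 0)

zeroed
```

An empty tibble means that no coefficient was set to zero at this penalty.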

6 Reproducibility

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.0 (2022-04-22)
#>  os       macOS Big Sur/Monterey 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Berlin
#>  date     2022-07-24
#>  pandoc   2.18 @ /usr/local/bin/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package        * version    date (UTC) lib source
#>  assertthat       0.2.1      2019-03-21 [1] CRAN (R 4.2.0)
#>  backports        1.4.1      2021-12-13 [1] CRAN (R 4.2.0)
#>  blogdown         1.10       2022-05-10 [1] CRAN (R 4.2.0)
#>  bookdown         0.27       2022-06-14 [1] CRAN (R 4.2.0)
#>  brio             1.1.3      2021-11-30 [1] CRAN (R 4.2.0)
#>  broom          * 1.0.0      2022-07-01 [1] CRAN (R 4.2.0)
#>  bslib            0.3.1      2021-10-06 [1] CRAN (R 4.2.0)
#>  cachem           1.0.6      2021-08-19 [1] CRAN (R 4.2.0)
#>  callr            3.7.0      2021-04-20 [1] CRAN (R 4.2.0)
#>  cellranger       1.1.0      2016-07-27 [1] CRAN (R 4.2.0)
#>  class            7.3-20     2022-01-16 [2] CRAN (R 4.2.0)
#>  cli              3.3.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  codetools        0.2-18     2020-11-04 [2] CRAN (R 4.2.0)
#>  colorout       * 1.2-2      2022-06-13 [1] local
#>  colorspace       2.0-3      2022-02-21 [1] CRAN (R 4.2.0)
#>  crayon           1.5.1      2022-03-26 [1] CRAN (R 4.2.0)
#>  DBI              1.1.2      2021-12-20 [1] CRAN (R 4.2.0)
#>  dbplyr           2.2.0      2022-06-05 [1] CRAN (R 4.2.0)
#>  desc             1.4.1      2022-03-06 [1] CRAN (R 4.2.0)
#>  devtools         2.4.3      2021-11-30 [1] CRAN (R 4.2.0)
#>  dials          * 1.0.0      2022-06-14 [1] CRAN (R 4.2.0)
#>  DiceDesign       1.9        2021-02-13 [1] CRAN (R 4.2.0)
#>  digest           0.6.29     2021-12-01 [1] CRAN (R 4.2.0)
#>  dplyr          * 1.0.9      2022-04-28 [1] CRAN (R 4.2.0)
#>  ellipsis         0.3.2      2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate         0.15       2022-02-18 [1] CRAN (R 4.2.0)
#>  fansi            1.0.3      2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap          1.1.0      2021-01-25 [1] CRAN (R 4.2.0)
#>  forcats        * 0.5.1      2021-01-27 [1] CRAN (R 4.2.0)
#>  foreach          1.5.2      2022-02-02 [1] CRAN (R 4.2.0)
#>  fs               1.5.2      2021-12-08 [1] CRAN (R 4.2.0)
#>  furrr            0.3.0      2022-05-04 [1] CRAN (R 4.2.0)
#>  future           1.26.1     2022-05-27 [1] CRAN (R 4.2.0)
#>  future.apply     1.9.0      2022-04-25 [1] CRAN (R 4.2.0)
#>  generics         0.1.3      2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2        * 3.3.6      2022-05-03 [1] CRAN (R 4.2.0)
#>  glmnet         * 4.1-4      2022-04-15 [1] CRAN (R 4.2.0)
#>  globals          0.15.0     2022-05-09 [1] CRAN (R 4.2.0)
#>  glue             1.6.2      2022-02-24 [1] CRAN (R 4.2.0)
#>  gower            1.0.0      2022-02-03 [1] CRAN (R 4.2.0)
#>  GPfit            1.0-8      2019-02-08 [1] CRAN (R 4.2.0)
#>  gtable           0.3.0      2019-03-25 [1] CRAN (R 4.2.0)
#>  hardhat          1.2.0      2022-06-30 [1] CRAN (R 4.2.0)
#>  haven            2.5.0      2022-04-15 [1] CRAN (R 4.2.0)
#>  hms              1.1.1      2021-09-26 [1] CRAN (R 4.2.0)
#>  htmltools        0.5.2      2021-08-25 [1] CRAN (R 4.2.0)
#>  httr             1.4.3      2022-05-04 [1] CRAN (R 4.2.0)
#>  infer          * 1.0.2      2022-06-10 [1] CRAN (R 4.2.0)
#>  ipred            0.9-13     2022-06-02 [1] CRAN (R 4.2.0)
#>  iterators        1.0.14     2022-02-05 [1] CRAN (R 4.2.0)
#>  jquerylib        0.1.4      2021-04-26 [1] CRAN (R 4.2.0)
#>  jsonlite         1.8.0      2022-02-22 [1] CRAN (R 4.2.0)
#>  knitr            1.39       2022-04-26 [1] CRAN (R 4.2.0)
#>  lattice          0.20-45    2021-09-22 [2] CRAN (R 4.2.0)
#>  lava             1.6.10     2021-09-02 [1] CRAN (R 4.2.0)
#>  lhs              1.1.5      2022-03-22 [1] CRAN (R 4.2.0)
#>  lifecycle        1.0.1      2021-09-24 [1] CRAN (R 4.2.0)
#>  listenv          0.8.0      2019-12-05 [1] CRAN (R 4.2.0)
#>  lubridate        1.8.0      2021-10-07 [1] CRAN (R 4.2.0)
#>  magrittr         2.0.3      2022-03-30 [1] CRAN (R 4.2.0)
#>  MASS             7.3-56     2022-03-23 [2] CRAN (R 4.2.0)
#>  Matrix         * 1.4-1      2022-03-23 [2] CRAN (R 4.2.0)
#>  memoise          2.0.1      2021-11-26 [1] CRAN (R 4.2.0)
#>  modeldata      * 1.0.0      2022-07-01 [1] CRAN (R 4.2.0)
#>  modelr           0.1.8      2020-05-19 [1] CRAN (R 4.2.0)
#>  munsell          0.5.0      2018-06-12 [1] CRAN (R 4.2.0)
#>  nnet             7.3-17     2022-01-16 [2] CRAN (R 4.2.0)
#>  palmerpenguins   0.1.0      2020-07-23 [1] CRAN (R 4.2.0)
#>  parallelly       1.32.0     2022-06-07 [1] CRAN (R 4.2.0)
#>  parsnip        * 1.0.0      2022-06-16 [1] CRAN (R 4.2.0)
#>  pillar           1.7.0      2022-02-01 [1] CRAN (R 4.2.0)
#>  pkgbuild         1.3.1      2021-12-20 [1] CRAN (R 4.2.0)
#>  pkgconfig        2.0.3      2019-09-22 [1] CRAN (R 4.2.0)
#>  pkgload          1.2.4      2021-11-30 [1] CRAN (R 4.2.0)
#>  prettyunits      1.1.1      2020-01-24 [1] CRAN (R 4.2.0)
#>  processx         3.6.1      2022-06-17 [1] CRAN (R 4.2.0)
#>  prodlim          2019.11.13 2019-11-17 [1] CRAN (R 4.2.0)
#>  ps               1.7.1      2022-06-18 [1] CRAN (R 4.2.0)
#>  purrr          * 0.3.4      2020-04-17 [1] CRAN (R 4.2.0)
#>  R6               2.5.1      2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp             1.0.8.3    2022-03-17 [1] CRAN (R 4.2.0)
#>  readr          * 2.1.2      2022-01-30 [1] CRAN (R 4.2.0)
#>  readxl           1.4.0      2022-03-28 [1] CRAN (R 4.2.0)
#>  recipes        * 1.0.1      2022-07-07 [1] CRAN (R 4.2.0)
#>  remotes          2.4.2      2021-11-30 [1] CRAN (R 4.2.0)
#>  reprex           2.0.1      2021-08-05 [1] CRAN (R 4.2.0)
#>  rlang            1.0.3      2022-06-27 [1] CRAN (R 4.2.0)
#>  rmarkdown        2.14       2022-04-25 [1] CRAN (R 4.2.0)
#>  rpart            4.1.16     2022-01-24 [2] CRAN (R 4.2.0)
#>  rprojroot        2.0.3      2022-04-02 [1] CRAN (R 4.2.0)
#>  rsample        * 1.0.0      2022-06-24 [1] CRAN (R 4.2.0)
#>  rstudioapi       0.13       2020-11-12 [1] CRAN (R 4.2.0)
#>  rvest            1.0.2      2021-10-16 [1] CRAN (R 4.2.0)
#>  sass             0.4.1      2022-03-23 [1] CRAN (R 4.2.0)
#>  scales         * 1.2.0      2022-04-13 [1] CRAN (R 4.2.0)
#>  sessioninfo      1.2.2      2021-12-06 [1] CRAN (R 4.2.0)
#>  shape            1.4.6      2021-05-19 [1] CRAN (R 4.2.0)
#>  stringi          1.7.6      2021-11-29 [1] CRAN (R 4.2.0)
#>  stringr        * 1.4.0      2019-02-10 [1] CRAN (R 4.2.0)
#>  survival         3.3-1      2022-03-03 [2] CRAN (R 4.2.0)
#>  testthat         3.1.4      2022-04-26 [1] CRAN (R 4.2.0)
#>  tibble         * 3.1.7      2022-05-03 [1] CRAN (R 4.2.0)
#>  tidymodels     * 1.0.0      2022-07-13 [1] CRAN (R 4.2.0)
#>  tidyr          * 1.2.0      2022-02-01 [1] CRAN (R 4.2.0)
#>  tidyselect       1.1.2      2022-02-21 [1] CRAN (R 4.2.0)
#>  tidyverse      * 1.3.1      2021-04-15 [1] CRAN (R 4.2.0)
#>  timeDate         3043.102   2018-02-21 [1] CRAN (R 4.2.0)
#>  tune           * 1.0.0      2022-07-07 [1] CRAN (R 4.2.0)
#>  tzdb             0.3.0      2022-03-28 [1] CRAN (R 4.2.0)
#>  usethis          2.1.6      2022-05-25 [1] CRAN (R 4.2.0)
#>  utf8             1.2.2      2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs            0.4.1      2022-04-13 [1] CRAN (R 4.2.0)
#>  withr            2.5.0      2022-03-03 [1] CRAN (R 4.2.0)
#>  workflows      * 1.0.0      2022-07-05 [1] CRAN (R 4.2.0)
#>  workflowsets   * 1.0.0      2022-07-12 [1] CRAN (R 4.2.0)
#>  xfun             0.31       2022-05-10 [1] CRAN (R 4.2.0)
#>  xml2             1.3.3      2021-11-30 [1] CRAN (R 4.2.0)
#>  yaml             2.3.5      2022-02-21 [1] CRAN (R 4.2.0)
#>  yardstick      * 1.0.0      2022-06-06 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Users/sebastiansaueruser/Rlibs
#>  [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────