1 Load packages
library(tidyverse) # data wrangling
library(easystats) # parameters(), standardize()
2 Data
data(mtcars)
3 Motivation
In this post, we'll investigate what happens to the coefficients of a simple logistic regression when the predictor variables are z-standardized, and whether it makes sense to standardize the outcome variable as well.
Do the coefficients change as a result of standardizing the values?
4 EDA
mtcars |>
group_by(am) |>
summarise(mpg_Avg = mean(mpg))
#> # A tibble: 2 × 2
#>      am mpg_Avg
#>   <dbl>   <dbl>
#> 1     0    17.1
#> 2     1    24.4
As we can see, cars with am=1, i.e., manual (gear shifting) cars, have a higher mean mpg value (24.4 vs. 17.1).
5 Model with raw values
mod_raw <- glm(am ~ mpg, data = mtcars, family = "binomial")
parameters(mod_raw, exponentiate = TRUE)
#> Parameter | Odds Ratio | SE | 95% CI | z | p
#> ------------------------------------------------------------------
#> (Intercept) | 1.36e-03 | 3.19e-03 | [0.00, 0.06] | -2.81 | 0.005
#> mpg | 1.36 | 0.16 | [1.13, 1.80] | 2.67 | 0.008
The odds ratio of 1.36 means that for every one-unit increase in mpg, the odds of a car having a manual transmission increase by 36%.
Note that the logistic regression (in R) models the probability of the second level of the outcome variable, which here is am=1, i.e., manual transmission (see here for more information).
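To make the "36% per unit" reading concrete, here is a minimal sketch in base R. It refits the same model and compares the odds at two neighbouring mpg values. The helper odds_at() and the object names are my own additions for illustration, not part of the original analysis.

```r
# Refit the raw-scale model using base R only (mtcars ships with R)
mod <- glm(am ~ mpg, data = mtcars, family = binomial)

b <- coef(mod)[["mpg"]]   # slope on the log-odds (logit) scale
odds_ratio <- exp(b)      # multiplicative change in the odds per 1 mpg

# Hypothetical helper: odds of a manual transmission at a given mpg value
odds_at <- function(mpg) {
  p <- predict(mod, newdata = data.frame(mpg = mpg), type = "response")
  unname(p / (1 - p))
}

# The ratio of the odds at 21 vs. 20 mpg equals the odds ratio
ratio_21_20 <- odds_at(21) / odds_at(20)

# Both entries should come out at about 1.36, matching the table above
round(c(odds_ratio = odds_ratio, ratio_21_20 = ratio_21_20), 2)
```

Any other pair of mpg values one unit apart gives the same ratio, because the model is linear on the log-odds scale.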
6 Model with am as a factor variable
mtcars <-
mtcars |>
mutate(am_f = factor(am))
levels(mtcars$am_f)
#> [1] "0" "1"
mod_raw_f <- glm(am_f ~ mpg, data = mtcars, family = "binomial") # factor outcome this time
parameters(mod_raw_f, exponentiate = TRUE)
#> Parameter | Odds Ratio | SE | 95% CI | z | p
#> ------------------------------------------------------------------
#> (Intercept) | 1.36e-03 | 3.19e-03 | [0.00, 0.06] | -2.81 | 0.005
#> mpg | 1.36 | 0.16 | [1.13, 1.80] | 2.67 | 0.008
Identical!
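The equivalence of the numeric and the factor outcome, and the role of the "second level", can be checked with a short sketch. It uses base R only. The data frame dat and the reversed factor am_rev are hypothetical names introduced here for illustration.

```r
# Outcome as a factor, once with the default level order and once reversed
dat <- transform(mtcars, am_f = factor(am))    # levels "0", "1"
dat$am_rev <- relevel(dat$am_f, ref = "1")     # levels "1", "0"

fit_num <- glm(am ~ mpg, data = dat, family = binomial)
fit_f   <- glm(am_f ~ mpg, data = dat, family = binomial)
fit_rev <- glm(am_rev ~ mpg, data = dat, family = binomial)

# Numeric 0/1 and factor "0"/"1" outcomes give the same coefficients
same_as_numeric <- isTRUE(all.equal(coef(fit_num), coef(fit_f), tolerance = 1e-6))

# Reversing the levels makes "0" the modelled level, so all signs flip
signs_flip <- isTRUE(all.equal(coef(fit_rev), -coef(fit_f), tolerance = 1e-6))

c(same_as_numeric = same_as_numeric, signs_flip = signs_flip)
```

On the odds-ratio scale, a sign flip of the coefficient corresponds to taking the reciprocal, so it pays to check levels() before interpreting the output.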
7 Visualizing
pred_df <-
tibble(
mpg = seq(min(mtcars$mpg), max(mtcars$mpg), by = .1),
am_pred = predict(mod_raw, type = "response", newdata = tibble(mpg))
)
ggplot(mtcars) +
aes(x = mpg, y = am) +
geom_point() +
geom_line(data = pred_df, aes(x = mpg, y = am_pred), color = "blue") +
labs(title = "Predicting manual gear shifting",
subtitle = "Logistic model")
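The blue curve is the inverse logit of the linear predictor. As a minimal sketch (base R only; mpg_grid and mpg_50 are illustrative names I chose), the predictions from predict(type = "response") can be reproduced by hand, and we can locate the mpg value at which the curve crosses a probability of 0.5.

```r
fit <- glm(am ~ mpg, data = mtcars, family = binomial)
a <- coef(fit)[["(Intercept)"]]
b <- coef(fit)[["mpg"]]

# A few example mpg values (chosen arbitrarily for illustration)
mpg_grid <- c(15, 20, 25)

# Inverse logit of the linear predictor, computed by hand ...
p_manual <- plogis(a + b * mpg_grid)

# ... and via predict(), as in the plotting code above
p_predict <- unname(
  predict(fit, newdata = data.frame(mpg = mpg_grid), type = "response")
)

# The log-odds are zero, i.e. the probability is 0.5, at mpg = -a / b
mpg_50 <- -a / b

round(rbind(p_manual, p_predict), 3)
mpg_50
```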
8 Standardizing predictors
mtcars_z <-
mtcars |>
mutate(across(c(where(is.numeric), -am), standardize)) # z-scale every numeric column except the outcome am
9 Model with z-scaled predictors
mod_z <- glm(am ~ mpg, data = mtcars_z, family = "binomial")
parameters(mod_z, exponentiate = TRUE)
#> Parameter | Odds Ratio | SE | 95% CI | z | p
#> ---------------------------------------------------------------
#> (Intercept) | 0.65 | 0.29 | [0.25, 1.58] | -0.96 | 0.338
#> mpg | 6.36 | 4.40 | [2.09, 34.49] | 2.67 | 0.008
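The z-scaled model is the raw model expressed in different units. The following sketch, assuming the default z-transformation (centre at the mean, divide by the sample standard deviation), recovers the standardized coefficients from the raw ones. It uses base R instead of standardize(), and all object names are my own.

```r
fit_raw <- glm(am ~ mpg, data = mtcars, family = binomial)

# z-scale mpg by hand: (x - mean) / sd
dat_z <- transform(mtcars, mpg = (mpg - mean(mpg)) / sd(mpg))
fit_z <- glm(am ~ mpg, data = dat_z, family = binomial)

b_raw <- coef(fit_raw)
b_z <- coef(fit_z)

# Slope: one SD of mpg is sd(mpg) raw units, so the log-odds slope scales by sd(mpg)
slope_from_raw <- b_raw[["mpg"]] * sd(mtcars$mpg)

# Intercept: z = 0 corresponds to mpg = mean(mpg)
intercept_from_raw <- b_raw[["(Intercept)"]] + b_raw[["mpg"]] * mean(mtcars$mpg)

# Odds ratios of the z-scaled model, and the same values rebuilt from the raw model
round(exp(c(b_z[["(Intercept)"]], b_z[["mpg"]])), 2)
round(exp(c(intercept_from_raw, slope_from_raw)), 2)

# The fitted probabilities do not change at all
max_diff <- max(abs(fitted(fit_raw) - fitted(fit_z)))
```

This is also why the z and p values for mpg are the same in both tables: rescaling a predictor multiplies the coefficient and its standard error by the same factor.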
10 Model with all variables z-scaled
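Note that it makes no sense to z-scale the outcome variable of a logistic regression: the binomial model expects a binary outcome (0/1 or a two-level factor), and z-scaled values are no longer valid class labels.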
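To see what happens in practice, here is a minimal sketch (base R only; am_z, mpg_z and res are illustrative names). It z-scales the outcome and catches the error that glm() raises for a binomial family when the outcome lies outside the range 0 to 1.

```r
# z-scale both the outcome and the predictor
dat_z <- transform(
  mtcars,
  am_z = as.numeric(scale(am)),
  mpg_z = as.numeric(scale(mpg))
)

# The z-scaled outcome takes one negative value and one value above 1
range(dat_z$am_z)

# glm() refuses to fit; we capture the error message instead of stopping
res <- tryCatch(
  glm(am_z ~ mpg_z, data = dat_z, family = binomial),
  error = function(e) conditionMessage(e)
)
res
```

Here res holds the error message rather than a fitted model, so there is no "fully standardized" logistic regression to report.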
11 Conclusion
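As can be seen, the odds ratio for mpg gets much bigger after standardization: it rises from 1.36 to 6.36. This does not mean the effect got stronger. The odds ratio now refers to an increase of one standard deviation of mpg (about 6 mpg) instead of an increase of 1 mpg. The z and p values for mpg are the same in both models (2.67 and 0.008); only the intercept, which now refers to a car with average mpg, has a different test.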
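The size of the standardized odds ratio follows directly from the z-transformation. As a short derivation, writing $\beta_0$ and $\beta_1$ for the raw-scale coefficients and $\bar{x}$, $s_x$ for the mean and standard deviation of mpg (this notation is mine, not from the output above):

```latex
\begin{aligned}
z &= \frac{x - \bar{x}}{s_x}
  \quad\Longrightarrow\quad x = \bar{x} + s_x z \\
\operatorname{logit}(p) &= \beta_0 + \beta_1 x
  = (\beta_0 + \beta_1 \bar{x}) + (\beta_1 s_x)\, z \\
\mathrm{OR}_z &= e^{\beta_1 s_x}
  = \left(e^{\beta_1}\right)^{s_x}
  = \mathrm{OR}_{\mathrm{raw}}^{\,s_x}
\end{aligned}
```

With $\mathrm{OR}_{\mathrm{raw}} \approx 1.36$ and $s_x \approx 6.03$ this gives roughly $6.4$, which agrees with the reported 6.36 up to rounding of the raw odds ratio.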
12 Reproducibility
#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.2.1 (2022-06-23)
#> os macOS Big Sur ... 10.16
#> system x86_64, darwin17.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2023-12-20
#> pandoc 3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> bayestestR * 0.13.1 2023-04-07 [1] CRAN (R 4.2.0)
#> blogdown 1.18 2023-06-19 [1] CRAN (R 4.2.0)
#> bookdown 0.36 2023-10-16 [1] CRAN (R 4.2.0)
#> bslib 0.5.1 2023-08-11 [1] CRAN (R 4.2.0)
#> cachem 1.0.8 2023-05-01 [1] CRAN (R 4.2.0)
#> callr 3.7.3 2022-11-02 [1] CRAN (R 4.2.0)
#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.2.0)
#> coda 0.19-4 2020-09-30 [1] CRAN (R 4.2.0)
#> codetools 0.2-19 2023-02-01 [1] CRAN (R 4.2.0)
#> colorout * 1.3-0 2023-11-08 [1] Github (jalvesaq/colorout@8384882)
#> colorspace 2.1-0 2023-01-23 [1] CRAN (R 4.2.0)
#> correlation * 0.8.4 2023-04-06 [1] CRAN (R 4.2.1)
#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.2.1)
#> datawizard * 0.9.0 2023-09-15 [1] CRAN (R 4.2.0)
#> devtools 2.4.5 2022-10-11 [1] CRAN (R 4.2.1)
#> digest 0.6.33 2023-07-07 [1] CRAN (R 4.2.0)
#> dplyr * 1.1.3 2023-09-03 [1] CRAN (R 4.2.0)
#> easystats * 0.7.0 2023-11-05 [1] CRAN (R 4.2.1)
#> effectsize * 0.8.6 2023-09-14 [1] CRAN (R 4.2.0)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.2.0)
#> emmeans 1.8.9 2023-10-17 [1] CRAN (R 4.2.0)
#> estimability 1.4.1 2022-08-05 [1] CRAN (R 4.2.0)
#> evaluate 0.21 2023-05-05 [1] CRAN (R 4.2.0)
#> fansi 1.0.5 2023-10-08 [1] CRAN (R 4.2.0)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.0)
#> forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.2.0)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 4.2.0)
#> generics 0.1.3 2022-07-05 [1] CRAN (R 4.2.0)
#> ggplot2 * 3.4.4 2023-10-12 [1] CRAN (R 4.2.0)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0)
#> gtable 0.3.4 2023-08-21 [1] CRAN (R 4.2.0)
#> hms 1.1.3 2023-03-21 [1] CRAN (R 4.2.0)
#> htmltools 0.5.6.1 2023-10-06 [1] CRAN (R 4.2.0)
#> htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.2.0)
#> httpuv 1.6.11 2023-05-11 [1] CRAN (R 4.2.0)
#> insight * 0.19.7 2023-11-26 [1] CRAN (R 4.2.1)
#> jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.2.0)
#> jsonlite 1.8.7 2023-06-29 [1] CRAN (R 4.2.0)
#> knitr 1.45 2023-10-30 [1] CRAN (R 4.2.1)
#> later 1.3.1 2023-05-02 [1] CRAN (R 4.2.0)
#> lattice 0.21-8 2023-04-05 [1] CRAN (R 4.2.0)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.2.1)
#> lubridate * 1.9.3 2023-09-27 [1] CRAN (R 4.2.0)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0)
#> MASS 7.3-60 2023-05-04 [1] CRAN (R 4.2.0)
#> Matrix 1.5-4.1 2023-05-18 [1] CRAN (R 4.2.0)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.2.0)
#> mime 0.12 2021-09-28 [1] CRAN (R 4.2.0)
#> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.2.0)
#> modelbased * 0.8.6 2023-01-13 [1] CRAN (R 4.2.1)
#> multcomp 1.4-25 2023-06-20 [1] CRAN (R 4.2.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 4.2.0)
#> mvtnorm 1.2-2 2023-06-08 [1] CRAN (R 4.2.0)
#> parameters * 0.21.3 2023-11-02 [1] CRAN (R 4.2.1)
#> performance * 0.10.8 2023-10-30 [1] CRAN (R 4.2.1)
#> pillar 1.9.0 2023-03-22 [1] CRAN (R 4.2.0)
#> pkgbuild 1.4.0 2022-11-27 [1] CRAN (R 4.2.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.2.0)
#> pkgload 1.3.2.1 2023-07-08 [1] CRAN (R 4.2.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.2.0)
#> processx 3.8.2 2023-06-30 [1] CRAN (R 4.2.0)
#> profvis 0.3.8 2023-05-02 [1] CRAN (R 4.2.0)
#> promises 1.2.1 2023-08-10 [1] CRAN (R 4.2.0)
#> ps 1.7.5 2023-04-18 [1] CRAN (R 4.2.0)
#> purrr * 1.0.2 2023-08-10 [1] CRAN (R 4.2.0)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.2.0)
#> Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.2.0)
#> readr * 2.1.4 2023-02-10 [1] CRAN (R 4.2.0)
#> remotes 2.4.2.1 2023-07-18 [1] CRAN (R 4.2.0)
#> report * 0.5.8 2023-12-07 [1] CRAN (R 4.2.1)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 4.2.0)
#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.2.0)
#> rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.2.0)
#> sandwich 3.0-2 2022-06-15 [1] CRAN (R 4.2.0)
#> sass 0.4.7 2023-07-15 [1] CRAN (R 4.2.0)
#> scales 1.2.1 2022-08-20 [1] CRAN (R 4.2.0)
#> see * 0.8.1 2023-11-03 [1] CRAN (R 4.2.1)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
#> shiny 1.8.0 2023-11-17 [1] CRAN (R 4.2.1)
#> stringi 1.7.12 2023-01-11 [1] CRAN (R 4.2.0)
#> stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.2.1)
#> survival 3.5-5 2023-03-12 [1] CRAN (R 4.2.0)
#> TH.data 1.1-2 2023-04-17 [1] CRAN (R 4.2.0)
#> tibble * 3.2.1 2023-03-20 [1] CRAN (R 4.2.0)
#> tidyr * 1.3.0 2023-01-24 [1] CRAN (R 4.2.0)
#> tidyselect 1.2.0 2022-10-10 [1] CRAN (R 4.2.0)
#> tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.2.0)
#> timechange 0.2.0 2023-01-11 [1] CRAN (R 4.2.0)
#> tzdb 0.4.0 2023-05-12 [1] CRAN (R 4.2.0)
#> urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.2.0)
#> usethis 2.2.2 2023-07-06 [1] CRAN (R 4.2.0)
#> utf8 1.2.3 2023-01-31 [1] CRAN (R 4.2.0)
#> vctrs 0.6.4 2023-10-12 [1] CRAN (R 4.2.0)
#> withr 2.5.2 2023-10-30 [1] CRAN (R 4.2.1)
#> xfun 0.40 2023-08-09 [1] CRAN (R 4.2.0)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.2.0)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.0)
#> zoo 1.8-12 2023-04-13 [1] CRAN (R 4.2.0)
#>
#> [1] /Users/sebastiansaueruser/Rlibs
#> [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────