Consistency of set.seed across different systems

1 Load packages

library(tidyverse)  # data wrangling
library(digest)
library(rstanarm)

2 Motivation

Reproducibility of results is a major concern in science and industry alike. However, there are numerous pitfalls which may threaten reproducibility. This post explores one possible issue - the fixation of start values of random numbers drawn by R.

Reasons that seeds in R across different systems may (possibly) vary (I don’t know if they do, but I heard so) may include:

  • different OSes
  • different version of OSes
  • different versions of R

3 User error

Maybe the most straight forward explanation is a user error. Don’t forget to call set.seed() right before the calculation. Be sure to use a pre defined seed value.

4 Your help needed

Please let me know whether (or not) you get the same random numbers, once the seed is fixed. In order to check the possible influence source, please note your OS, and the R version (or paste the session info).

Thanks!

5 Same random numbers

What set.seed() does is the fix the value of .Random.seed, which is in turn used to initiate the random number generator of R.

.Random.seed %>% head()
#> [1]      10403        624 -709689667  906139226 -133236717 1588108856

5.1 Without seed

Without seed, the random numbers drawn by R will be transient, ie., subject of change:

rnorm(1)
#> [1] -1.051114
replicate(n = 10, expr = rnorm(1))
#>  [1]  0.01176695 -1.21472604  1.86946816  0.72725355 -2.53246585 -0.17330070
#>  [7] -1.61645738  0.44063028  0.73718828 -1.89312377

5.2 With seed

set.seed(42)

rnorm(1)
#> [1] 1.370958
set.seed(42)
replicate(n = 10, expr = rnorm(1))
#>  [1]  1.37095845 -0.56469817  0.36312841  0.63286260  0.40426832 -0.10612452
#>  [7]  1.51152200 -0.09465904  2.01842371 -0.06271410

5.3 Using a hash

Let’s use a hash number to control the output more succinctly:

set.seed(42)
x <- replicate(n = 10, expr = rnorm(1))

digest(x)
#> [1] "8cbf5bafbba3d3c63d39c586f29b3a3d"

Let’s recompute and compare the hash values:

set.seed(42)
x2 <- replicate(n = 10, expr = rnorm(1))

digest(x2)
#> [1] "8cbf5bafbba3d3c63d39c586f29b3a3d"
identical(x, x2)
#> [1] TRUE

Identical, as expected.

6 Seeds in regression models

6.1 lm

set.seed(42)
lm1 <- lm(mpg ~ hp, data = mtcars)
lm1_coef <- coef(lm1)
lm1_coef
#> (Intercept)          hp 
#> 30.09886054 -0.06822828

Rerun to double check:

set.seed(42)
lm1 <- lm(mpg ~ hp, data = mtcars)
lm1_coef <- coef(lm1)
lm1_coef
#> (Intercept)          hp 
#> 30.09886054 -0.06822828

6.2 Stan mtcars

lm2 <- stan_glm(mpg ~ hp, data = mtcars, seed = 42, refresh = 0)
lm2_coef <- coef(lm2)
lm2_coef
#> (Intercept)          hp 
#> 30.11668130 -0.06820988

Rerun:

lm2 <- stan_glm(mpg ~ hp, data = mtcars, seed = 42, refresh = 0)
lm2_coef <- coef(lm2)
lm2_coef
#> (Intercept)          hp 
#> 30.11668130 -0.06820988

6.3 Stan penguins

Import data and run the model:

penguins <- read.csv("https://vincentarelbundock.github.io/Rdatasets/csv/palmerpenguins/penguins.csv")

library(rstanarm)
lm3 <- stan_glm(body_mass_g ~ bill_length_mm, 
                seed = 42,
                refresh = 0,
                data = penguins)
coef(lm3)
#>    (Intercept) bill_length_mm 
#>      356.57465       87.45281

By the way:

Copy-paste the chunk above (which is self-containing in terms of data and R packages) and then type reprex::reprex() in the console (you need to have the package reprex installed). You’ll get a reproducible example on your clipboard.

Rerun:

lm3 <- stan_glm(body_mass_g ~ bill_length_mm, 
                seed = 42,
                refresh = 0,
                data = penguins)
coef(lm3)
#>    (Intercept) bill_length_mm 
#>      356.57465       87.45281

7 Session info

#> ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.2.1 (2022-06-23)
#>  os       macOS Big Sur ... 10.16
#>  system   x86_64, darwin17.0
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Berlin
#>  date     2023-01-18
#>  pandoc   2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
#>  package       * version  date (UTC) lib source
#>  assertthat      0.2.1    2019-03-21 [1] CRAN (R 4.2.0)
#>  backports       1.4.1    2021-12-13 [1] CRAN (R 4.2.0)
#>  base64enc       0.1-3    2015-07-28 [1] CRAN (R 4.2.0)
#>  bayesplot       1.10.0   2022-11-16 [1] CRAN (R 4.2.0)
#>  blogdown        1.16     2022-12-13 [1] CRAN (R 4.2.0)
#>  bookdown        0.31     2022-12-13 [1] CRAN (R 4.2.0)
#>  boot            1.3-28.1 2022-11-22 [1] CRAN (R 4.2.0)
#>  broom           1.0.2    2022-12-15 [1] CRAN (R 4.2.0)
#>  bslib           0.4.2    2022-12-16 [1] CRAN (R 4.2.0)
#>  cachem          1.0.6    2021-08-19 [1] CRAN (R 4.2.0)
#>  callr           3.7.3    2022-11-02 [1] CRAN (R 4.2.0)
#>  cellranger      1.1.0    2016-07-27 [1] CRAN (R 4.2.0)
#>  cli             3.6.0    2023-01-09 [1] CRAN (R 4.2.0)
#>  codetools       0.2-18   2020-11-04 [2] CRAN (R 4.2.1)
#>  colorout      * 1.2-2    2022-06-13 [1] local
#>  colorspace      2.0-3    2022-02-21 [1] CRAN (R 4.2.0)
#>  colourpicker    1.2.0    2022-10-28 [1] CRAN (R 4.2.0)
#>  crayon          1.5.2    2022-09-29 [1] CRAN (R 4.2.1)
#>  crosstalk       1.2.0    2021-11-04 [1] CRAN (R 4.2.0)
#>  DBI             1.1.3    2022-06-18 [1] CRAN (R 4.2.0)
#>  dbplyr          2.2.1    2022-06-27 [1] CRAN (R 4.2.0)
#>  devtools        2.4.5    2022-10-11 [1] CRAN (R 4.2.1)
#>  digest        * 0.6.31   2022-12-11 [1] CRAN (R 4.2.0)
#>  dplyr         * 1.0.10   2022-09-01 [1] CRAN (R 4.2.0)
#>  DT              0.26     2022-10-19 [1] CRAN (R 4.2.0)
#>  dygraphs        1.1.1.6  2018-07-11 [1] CRAN (R 4.2.0)
#>  ellipsis        0.3.2    2021-04-29 [1] CRAN (R 4.2.0)
#>  evaluate        0.19     2022-12-13 [1] CRAN (R 4.2.0)
#>  fansi           1.0.3    2022-03-24 [1] CRAN (R 4.2.0)
#>  fastmap         1.1.0    2021-01-25 [1] CRAN (R 4.2.0)
#>  forcats       * 0.5.2    2022-08-19 [1] CRAN (R 4.2.0)
#>  fs              1.5.2    2021-12-08 [1] CRAN (R 4.2.0)
#>  gargle          1.2.1    2022-09-08 [1] CRAN (R 4.2.0)
#>  generics        0.1.3    2022-07-05 [1] CRAN (R 4.2.0)
#>  ggplot2       * 3.4.0    2022-11-04 [1] CRAN (R 4.2.0)
#>  glue            1.6.2    2022-02-24 [1] CRAN (R 4.2.0)
#>  googledrive     2.0.0    2021-07-08 [1] CRAN (R 4.2.0)
#>  googlesheets4   1.0.1    2022-08-13 [1] CRAN (R 4.2.0)
#>  gridExtra       2.3      2017-09-09 [1] CRAN (R 4.2.0)
#>  gtable          0.3.1    2022-09-01 [1] CRAN (R 4.2.0)
#>  gtools          3.9.4    2022-11-27 [1] CRAN (R 4.2.0)
#>  haven           2.5.1    2022-08-22 [1] CRAN (R 4.2.0)
#>  hms             1.1.2    2022-08-19 [1] CRAN (R 4.2.0)
#>  htmltools       0.5.4    2022-12-07 [1] CRAN (R 4.2.0)
#>  htmlwidgets     1.6.1    2023-01-07 [1] CRAN (R 4.2.0)
#>  httpuv          1.6.8    2023-01-12 [1] CRAN (R 4.2.0)
#>  httr            1.4.4    2022-08-17 [1] CRAN (R 4.2.0)
#>  igraph          1.3.5    2022-09-22 [1] CRAN (R 4.2.0)
#>  inline          0.3.19   2021-05-31 [1] CRAN (R 4.2.0)
#>  jquerylib       0.1.4    2021-04-26 [1] CRAN (R 4.2.0)
#>  jsonlite        1.8.4    2022-12-06 [1] CRAN (R 4.2.0)
#>  knitr           1.41     2022-11-18 [1] CRAN (R 4.2.0)
#>  later           1.3.0    2021-08-18 [1] CRAN (R 4.2.0)
#>  lattice         0.20-45  2021-09-22 [2] CRAN (R 4.2.1)
#>  lifecycle       1.0.3    2022-10-07 [1] CRAN (R 4.2.0)
#>  lme4            1.1-31   2022-11-01 [1] CRAN (R 4.2.0)
#>  loo             2.5.1    2022-03-24 [1] CRAN (R 4.2.0)
#>  lubridate       1.9.0    2022-11-06 [1] CRAN (R 4.2.0)
#>  magrittr        2.0.3    2022-03-30 [1] CRAN (R 4.2.0)
#>  markdown        1.4      2022-11-16 [1] CRAN (R 4.2.0)
#>  MASS            7.3-58.1 2022-08-03 [1] CRAN (R 4.2.0)
#>  Matrix          1.5-3    2022-11-11 [1] CRAN (R 4.2.0)
#>  matrixStats     0.63.0   2022-11-18 [1] CRAN (R 4.2.0)
#>  memoise         2.0.1    2021-11-26 [1] CRAN (R 4.2.0)
#>  mime            0.12     2021-09-28 [1] CRAN (R 4.2.0)
#>  miniUI          0.1.1.1  2018-05-18 [1] CRAN (R 4.2.0)
#>  minqa           1.2.5    2022-10-19 [1] CRAN (R 4.2.1)
#>  modelr          0.1.10   2022-11-11 [1] CRAN (R 4.2.0)
#>  munsell         0.5.0    2018-06-12 [1] CRAN (R 4.2.0)
#>  nlme            3.1-161  2022-12-15 [1] CRAN (R 4.2.0)
#>  nloptr          2.0.3    2022-05-26 [1] CRAN (R 4.2.0)
#>  pillar          1.8.1    2022-08-19 [1] CRAN (R 4.2.0)
#>  pkgbuild        1.4.0    2022-11-27 [1] CRAN (R 4.2.0)
#>  pkgconfig       2.0.3    2019-09-22 [1] CRAN (R 4.2.0)
#>  pkgload         1.3.2    2022-11-16 [1] CRAN (R 4.2.0)
#>  plyr            1.8.8    2022-11-11 [1] CRAN (R 4.2.0)
#>  prettyunits     1.1.1    2020-01-24 [1] CRAN (R 4.2.0)
#>  processx        3.8.0    2022-10-26 [1] CRAN (R 4.2.0)
#>  profvis         0.3.7    2020-11-02 [1] CRAN (R 4.2.0)
#>  promises        1.2.0.1  2021-02-11 [1] CRAN (R 4.2.0)
#>  ps              1.7.2    2022-10-26 [1] CRAN (R 4.2.0)
#>  purrr         * 1.0.1    2023-01-10 [1] CRAN (R 4.2.0)
#>  R6              2.5.1    2021-08-19 [1] CRAN (R 4.2.0)
#>  Rcpp          * 1.0.9    2022-07-08 [1] CRAN (R 4.2.0)
#>  RcppParallel    5.1.6    2023-01-09 [1] CRAN (R 4.2.0)
#>  readr         * 2.1.3    2022-10-01 [1] CRAN (R 4.2.0)
#>  readxl          1.4.1    2022-08-17 [1] CRAN (R 4.2.0)
#>  remotes         2.4.2    2021-11-30 [1] CRAN (R 4.2.0)
#>  reprex          2.0.2    2022-08-17 [1] CRAN (R 4.2.0)
#>  reshape2        1.4.4    2020-04-09 [1] CRAN (R 4.2.0)
#>  rlang           1.0.6    2022-09-24 [1] CRAN (R 4.2.0)
#>  rmarkdown       2.19     2022-12-15 [1] CRAN (R 4.2.0)
#>  rstan           2.21.7   2022-09-08 [1] CRAN (R 4.2.0)
#>  rstanarm      * 2.21.3   2022-04-09 [1] CRAN (R 4.2.0)
#>  rstantools      2.2.0    2022-04-08 [1] CRAN (R 4.2.0)
#>  rstudioapi      0.14     2022-08-22 [1] CRAN (R 4.2.0)
#>  rvest           1.0.3    2022-08-19 [1] CRAN (R 4.2.0)
#>  sass            0.4.4    2022-11-24 [1] CRAN (R 4.2.0)
#>  scales          1.2.1    2022-08-20 [1] CRAN (R 4.2.0)
#>  sessioninfo     1.2.2    2021-12-06 [1] CRAN (R 4.2.0)
#>  shiny           1.7.4    2022-12-15 [1] CRAN (R 4.2.0)
#>  shinyjs         2.1.0    2021-12-23 [1] CRAN (R 4.2.0)
#>  shinystan       2.6.0    2022-03-03 [1] CRAN (R 4.2.0)
#>  shinythemes     1.2.0    2021-01-25 [1] CRAN (R 4.2.0)
#>  StanHeaders     2.21.0-7 2020-12-17 [1] CRAN (R 4.2.0)
#>  stringi         1.7.12   2023-01-11 [1] CRAN (R 4.2.0)
#>  stringr       * 1.5.0    2022-12-02 [1] CRAN (R 4.2.0)
#>  survival        3.5-0    2023-01-09 [1] CRAN (R 4.2.0)
#>  threejs         0.3.3    2020-01-21 [1] CRAN (R 4.2.0)
#>  tibble        * 3.1.8    2022-07-22 [1] CRAN (R 4.2.0)
#>  tidyr         * 1.2.1    2022-09-08 [1] CRAN (R 4.2.0)
#>  tidyselect      1.2.0    2022-10-10 [1] CRAN (R 4.2.0)
#>  tidyverse     * 1.3.2    2022-07-18 [1] CRAN (R 4.2.0)
#>  timechange      0.2.0    2023-01-11 [1] CRAN (R 4.2.0)
#>  tzdb            0.3.0    2022-03-28 [1] CRAN (R 4.2.0)
#>  urlchecker      1.0.1    2021-11-30 [1] CRAN (R 4.2.0)
#>  usethis         2.1.6    2022-05-25 [1] CRAN (R 4.2.0)
#>  utf8            1.2.2    2021-07-24 [1] CRAN (R 4.2.0)
#>  vctrs           0.5.1    2022-11-16 [1] CRAN (R 4.2.0)
#>  withr           2.5.0    2022-03-03 [1] CRAN (R 4.2.0)
#>  xfun            0.36     2022-12-21 [1] CRAN (R 4.2.0)
#>  xml2            1.3.3    2021-11-30 [1] CRAN (R 4.2.0)
#>  xtable          1.8-4    2019-04-21 [1] CRAN (R 4.2.0)
#>  xts             0.12.2   2022-10-16 [1] CRAN (R 4.2.0)
#>  yaml            2.3.6    2022-10-18 [1] CRAN (R 4.2.0)
#>  zoo             1.8-11   2022-09-17 [1] CRAN (R 4.2.0)
#> 
#>  [1] /Users/sebastiansaueruser/Rlibs
#>  [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────