Suppose you have a large number of columns of a dataframe, and you want to plot each column – say a histogram for each column.

This post shows some ways of achieving this.

Let’s take the mtcars dataset as an example.

`data(mtcars)`

We will use the tidyverse approach:

`library(tidyverse)`

# Way 1

```
mtcars %>%
select_if(is_numeric) %>%
map2(., names(.), ~ {ggplot(data = data_frame(.x),
aes(x = .x)) +
geom_histogram() +
labs(x= .y)})
#> $mpg
#>
#> $cyl
#>
#> $disp
#>
#> $hp
#>
#> $drat
#>
#> $wt
#>
#> $qsec
#>
#> $vs
#>
#> $am
#>
#> $gear
#>
#> $carb
```

Some explanations:

- First, we take the dataset
`mtcars`

. - Then, we map a function (ie.,
`ggplot()`

) to each column of`mtcars`

, but we also parse the names of`mtcars`

. `ggplot()`

likes dataframes, but`map()`

serves lists/vectors, so we have to enshrine each vector to a dataframe using`data_frame()`

.- The data comes from the first list (
`mtcars`

), that’s where`.x`

comes from (or refers to). - The names come from the second lsit (
`names(mtcars)`

), that’s where`.y`

points to.

# Way 2

A maybe more simple is this:

```
mtcars %>%
gather(key = item, value = value) %>%
ggplot() +
aes(x = value) +
geom_density() +
facet_wrap(~ item, ncol = 2, scales = "free")
```

# Test if column is normally distributed before doing anything else

Suppose we want to check whether a column is nicely normally distributed before plotting. That’s one way to checking that:

```
mtcars %>%
map(~ shapiro.test(.x)) %>%
map("p.value") %>%
keep(. > .05)
#> $mpg
#> [1] 0.1228814
#>
#> $drat
#> [1] 0.1100608
#>
#> $wt
#> [1] 0.09265499
#>
#> $qsec
#> [1] 0.5935176
```