# Setup

```
library(tidyverse)
data(mtcars)
```

# Using a log-Y regression

Gelman et al., in *Regression and Other Stories*, state that “when additivity and linearity are not reasonable assumptions” it may make sense to “take the logarithms of outcomes that are all positive” (p. 189).

A log-y regression can be defined as follows, in the simplest case:

\[\log y = b_0 + b_1X_1 + \ldots + \epsilon\]

Exponentiating both sides yields

\[y = e^{b_0 + b_1X_1 + \ldots + \epsilon}\]

This is a multiplicative model, as can be seen perhaps more clearly in this notation:

\[y = e^{b_0} \cdot e^{b_1 X_1} \cdot \ldots \cdot e^{\epsilon}\]

In essence, we can make use of log-y regression for modelling
*multiplicative* associations between x and y.

A multiplicative association also means that the relationship between x and y is not *linear* on the raw scale.
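The multiplicative property can be checked numerically. A small sketch, using made-up coefficients (not taken from any fitted model): increasing \(x\) by one unit multiplies the fitted \(y\) by the constant factor \(e^{b_1}\), no matter where on the \(x\)-axis we start.

```
# Illustrative (made-up) coefficients for a log-y model
b0 <- 2
b1 <- -0.003
y_hat <- function(x) exp(b0 + b1 * x)

# The ratio of fitted values one unit apart is constant ...
y_hat(101) / y_hat(100)
y_hat(11) / y_hat(10)

# ... and equals exp(b1)
exp(b1)
```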

Let’s have a look at an example.

```
m1 <- lm(log(mpg) ~ hp, data = mtcars)
coef(m1)
```

```
## (Intercept) hp
## 3.460466874 -0.003428734
```

The coefficient of the predictor is reported as -0.0034287. Exponentiating this value yields approximately:

```
b1_approx <- coef(m1)[2] %>% exp()
b1_approx
```

```
## hp
## 0.9965771
```

Thus, one more unit in \(X\) changes \(Y\) by a factor of \(\exp(b_1)\).
In this case, `exp(b1)` amounts to 0.9965771.
Of course, factors below one indicate a decrease in \(Y\).

In this example, we find a decrease of approx. 0.34% in y associated with a one-unit change in x.
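The percentage change can be computed directly from the coefficient; a short check, refitting `m1` from above so the chunk is self-contained:

```
# Percent change in mpg per one-unit increase in hp
m1 <- lm(log(mpg) ~ hp, data = mtcars)
b1 <- coef(m1)[["hp"]]

(exp(b1) - 1) * 100  # approx. -0.34, i.e. a 0.34% decrease
```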

Let’s have a look at the predictions:

```
pred_grid <-
  tibble(
    hp = 0:400,
    pred = predict(m1, tibble(hp))
  ) %>%
  mutate(pred_exp = exp(pred))

```
ggplot(mtcars) +
  aes(x = hp, y = mpg) +
  geom_point() +
  geom_line(data = pred_grid,
            aes(x = hp, y = pred_exp))

There’s a useful shortcut to interpreting log-y regression coefficients.

Compare the following (small) values and their respective exponentiated values:

```
dat <-
  tibble(
    xs = seq(-.5, .5, by = .1),
    xs_exp = exp(xs)
  )
```

```
ggplot(dat) +
  aes(x = xs, y = xs_exp) +
  geom_line() +
  geom_abline(slope = 1, intercept = 1)
```
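The plot shows that for small \(x\), \(\exp(x)\) stays close to the line \(1 + x\) (a first-order Taylor approximation). This is the shortcut: a small coefficient such as 0.05 can be read directly as "about +5%". A brief numerical check of how the approximation degrades as \(|x|\) grows:

```
xs <- seq(-.5, .5, by = .1)

# Discrepancy between exp(x) and its linear approximation 1 + x;
# small near zero, growing with |x|
round(exp(xs) - (1 + xs), 3)
```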