Chapter 8 Interactions

Two predictors interact when the effect of one on the response variable depends on the value of the other, so you need to know the values of both in order to make an accurate prediction of the response.

Predictors can interact in any type of regression model (so this chapter could really be placed almost anywhere).

8.1 Example: Quantitative-Categorical Interaction

gf_point(eval ~ beauty, color = ~female, shape = ~female, data = teach_beauty) %>%
  gf_lm()

Eval may go up as beauty increases, but the slope of the relationship is different for females and non-females. This is an interaction between beauty and female.
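As a sketch, the pattern in the plot can be written as a single model equation with a separate line for each group. The symbols below are notation assumed here for illustration, not taken from the text: the coefficients β0 to β3 and an indicator I_nf that is 1 for non-females and 0 for females (which group is coded 0 is itself an assumption at this point).

```latex
\begin{aligned}
\widehat{\text{eval}} &= \beta_0 + \beta_1\,\text{beauty} + \beta_2\,I_{\text{nf}} + \beta_3\,\text{beauty}\cdot I_{\text{nf}} \\
\text{females } (I_{\text{nf}} = 0):\quad \widehat{\text{eval}} &= \beta_0 + \beta_1\,\text{beauty} \\
\text{non-females } (I_{\text{nf}} = 1):\quad \widehat{\text{eval}} &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,\text{beauty}
\end{aligned}
```

The interaction coefficient β3 is the difference between the two slopes; if it were zero, the two lines would be parallel.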

8.2 Example: Categorical-Categorical Interaction

gf_boxplot(eval ~ formal | female, data = teach_beauty)

Perhaps informal dress affects eval scores, but really only for non-females; for females, how formal the dress is doesn’t seem to make a difference either way.

The effect of formal dress is different depending on the value of female. This is an interaction between formal and female.
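One way to make "the effect of one factor depends on the other" concrete is as a difference of differences between cell means. The sketch below uses a tiny data frame invented for illustration (the names toy, group, dress and score are hypothetical and are not the teach_beauty data), and shows that the interaction coefficient from lm() equals that difference of differences.

```r
# Hypothetical 2x2 data, two observations per cell, invented for illustration
toy <- data.frame(
  group = factor(rep(c("A", "B"), each = 4)),
  dress = factor(rep(rep(c("formal", "informal"), each = 2), times = 2)),
  score = c(4.0, 4.2, 4.0, 4.2,   # group A: dress makes no difference
            4.1, 4.3, 3.6, 3.8)   # group B: informal scores are lower
)

# Mean score in each group-by-dress cell
cell_means <- tapply(toy$score, list(toy$group, toy$dress), mean)

# Effect of informal dress within each group, then the difference between them
effect_A <- cell_means["A", "informal"] - cell_means["A", "formal"]
effect_B <- cell_means["B", "informal"] - cell_means["B", "formal"]
diff_of_diffs <- effect_B - effect_A

# The same quantity appears as the interaction coefficient of the fitted model
fit <- lm(score ~ group * dress, data = toy)
interaction_coef <- unname(coef(fit)["groupB:dressinformal"])

c(diff_of_diffs = diff_of_diffs, interaction_coef = interaction_coef)
```

With no interaction, the dress effect would be the same in both groups and this coefficient would be zero.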

8.3 Quantitative-Quantitative Interactions?

Yes, these are possible, but very hard to visualize and conceptualize. Basically, it would mean that the slope of the line for one predictor changes gradually as the value of a second variable changes.
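A small sketch may help with the "slope changes gradually" idea. It uses noise-free data made up for illustration (the names toy, x1, x2 and y are hypothetical, and the coefficients 1, 2, 0.5 and 0.3 are arbitrary choices), fits a model with an interaction, and then computes the slope for x1 at several values of x2.

```r
# Hypothetical, noise-free data invented for illustration
toy <- expand.grid(x1 = 0:5, x2 = c(0, 5, 10))
toy$y <- 1 + 2 * toy$x1 + 0.5 * toy$x2 + 0.3 * toy$x1 * toy$x2

fit <- lm(y ~ x1 * x2, data = toy)
b <- coef(fit)

# Slope of y on x1 at a given x2: the x1 coefficient plus
# the interaction coefficient times x2
slope_x1 <- function(x2) unname(b["x1"] + b["x1:x2"] * x2)

slopes <- slope_x1(c(0, 5, 10))
slopes
```

Because the data were built with an x1 coefficient of 2 and an interaction coefficient of 0.3, the slope for x1 rises from about 2 when x2 is 0 to about 5 when x2 is 10.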

8.4 R code

If you want to include an interaction term in a model in R, use a * rather than a + between the predictors that (may) interact. For example, based on our exploration above, we might try:

beauty_mod <- lm(eval ~ beauty*female +
                   formal*female, 
                 data = teach_beauty, 
                 na.action = 'na.fail')
summary(beauty_mod)

## 
## Call:
## lm(formula = eval ~ beauty * female + formal * female, data = teach_beauty, 
##     na.action = "na.fail")
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.8342 -0.3676  0.0497  0.3979  1.0716 
## 
## Coefficients:
##                                       Estimate Std. Error t value Pr(>|t|)
## (Intercept)                             3.8484     0.1081   35.59   <2e-16
## beauty                                  0.0902     0.0474    1.90    0.058
## femalenot female                        0.2761     0.1313    2.10    0.036
## formalInformal Dress                    0.0575     0.1157    0.50    0.620
## beauty:femalenot female                 0.1084     0.0645    1.68    0.094
## femalenot female:formalInformal Dress  -0.0838     0.1428   -0.59    0.558
##                                          
## (Intercept)                           ***
## beauty                                .  
## femalenot female                      *  
## formalInformal Dress                     
## beauty:femalenot female               .  
## femalenot female:formalInformal Dress    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.537 on 457 degrees of freedom
## Multiple R-squared:  0.0733, Adjusted R-squared:  0.0631 
## F-statistic: 7.23 on 5 and 457 DF,  p-value: 1.59e-06

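Notice the additional indicator variables in the coefficient table/model equation. Now the effect of the beauty predictor depends on the value of female, which interacts with it, and so does the effect of formal. To get the slope for beauty or the shift for Informal Dress in the "not female" group, we add the corresponding interaction coefficient to the main coefficient.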
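As a worked reading of the coefficient table above, using only the printed estimates, the group-specific effects are obtained by adding each interaction coefficient to the matching main coefficient. Here female is the reference level, since the indicator shown is "femalenot female".

```latex
\begin{aligned}
\text{slope of beauty, female (reference level)} &= 0.0902 \\
\text{slope of beauty, not female} &= 0.0902 + 0.1084 = 0.1986 \\
\text{Informal Dress shift, female} &= 0.0575 \\
\text{Informal Dress shift, not female} &= 0.0575 + (-0.0838) = -0.0263
\end{aligned}
```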

We can use IC-based model selection to determine whether including these interactions in a model is important or not.

library(MuMIn)  # provides dredge()
# fit every allowed sub-model of beauty_mod and rank the fits by AIC
dredge(beauty_mod, rank = 'AIC')

## Global model call: lm(formula = eval ~ beauty * female + formal * female, data = teach_beauty, 
##     na.action = "na.fail")
## ---
## Model selection table 
##    (Int)    bty fml frm bty:fml fml:frm df logLik AIC delta weight
## 12  3.90 0.0876   +           +          5   -366 743  0.00  0.414
## 4   3.90 0.1486   +                      4   -368 744  1.12  0.236
## 16  3.90 0.0877   +   +       +          6   -366 745  2.00  0.152
## 8   3.90 0.1486   +   +                  5   -368 746  3.12  0.087
## 32  3.85 0.0902   +   +       +       +  7   -366 746  3.65  0.067
## 24  3.83 0.1488   +   +               +  6   -368 747  4.50  0.044
## 2   4.01 0.1330                          3   -375 757 14.03  0.000
## 6   4.03 0.1317       +                  4   -375 758 15.88  0.000
## 3   3.90          +                      3   -379 763 20.39  0.000
## 7   3.93          +   +                  4   -378 765 22.12  0.000
## 23  3.87          +   +               +  5   -378 766 23.59  0.000
## 1   4.00                                 2   -384 772 28.88  0.000
## 5   4.04              +                  3   -383 773 30.24  0.000
## Models ranked by AIC(x)

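In the case of the particular model we fitted, the “best” (lowest-AIC) model starting from this full model keeps the beauty-female interaction but drops formal and its interaction with female. The evidence for the interaction is modest, though: the model with beauty and female but no interaction is only 1.12 AIC units behind. If you want to explore the dataset further, you will find that actually a model where age, beauty AND female interact fits much better…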
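The weight column in the dredge() table can be reproduced by hand from the delta column, which may help in reading it. The sketch below copies the delta values from the output above and applies the usual Akaike-weight formula; the variable names are chosen here for illustration.

```r
# delta column copied from the dredge() table above, in the order printed
delta <- c(0.00, 1.12, 2.00, 3.12, 3.65, 4.50, 14.03,
           15.88, 20.39, 22.12, 23.59, 28.88, 30.24)

# Akaike weight: relative likelihood exp(-delta / 2), rescaled to sum to 1
rel_lik <- exp(-delta / 2)
akaike_weight <- rel_lik / sum(rel_lik)

round(akaike_weight, 3)
```

The rounded values should agree with the weight column in the table. A weight can be read as the relative support for each model within this candidate set only; it says nothing about models that were not fitted.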

8.5 Cautionary note

If you include an interaction in a regression model, you must also include the corresponding main effects: if your model has an indicator variable or slope term for an interaction, it must also have the indicator variables or slopes for the individual predictors involved. When you specify the interaction with *, our fitting functions (lm(), glm(), glmmTMB(), etc.) include those terms for you. So does dredge(), which will not consider a model that has an interaction without the matching individual predictors. (It would take effort to mess this up in R.)
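To see what R does with a formula, you can inspect its terms without fitting anything. The sketch below uses hypothetical variable names y, a and b. It shows that the * shorthand expands to the individual predictors plus their interaction, and that writing only a:b is the way one could leave the individual terms out by accident.

```r
# Hypothetical variable names; no data are needed to inspect a formula
star_terms  <- attr(terms(y ~ a * b), "term.labels")        # the * shorthand
full_terms  <- attr(terms(y ~ a + b + a:b), "term.labels")  # written out in full
colon_terms <- attr(terms(y ~ a:b), "term.labels")          # interaction term only

star_terms
full_terms
colon_terms
```

The first two formulas give the same three terms, while the third contains only the interaction, so sticking with * keeps the individual predictors in the model.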