Setup

Assign roles (R coder, manager, chat-message-sender, graphics/notes reference…) in your group.

The Game

According to a recent paper in the Harvard Kennedy School Misinformation Review,

We present Harmony Square, a short, free-to-play online game in which players learn how political misinformation is produced and spread. We find that the game confers psychological resistance against manipulation techniques commonly used in political misinformation: players from around the world find social media content making use of these techniques significantly less reliable after playing, are more confident in their ability to spot such content, and less likely to report sharing it with others in their network.

You may have played the game in preparation for today’s class; if you didn’t, maybe you’ll be inspired to play afterward (or if you have time at the end of class)!

Data

We have access to data from a research study whose goals were:

to find out whether people who play Harmony Square, compared to a gamified control group (which played Tetris for around the same amount of time it takes to complete Harmony Square):

1) find manipulative social media content less reliable after playing [the Reliability score measures how reliable they thought the content was];
2) are more confident in their ability to spot such manipulative content [the Confidence score measures their confidence]; and
3) are less likely to indicate that they are willing to share manipulative social media content in their network [the Sharing score; higher means more likely to share].

The code below reads in and wrangles the dataset for our analysis (parallel to, but not exactly the same as, that presented in the published paper). It’s shown for those who are interested, but you need not be able to explain or replicate it.

library(tidyverse)  # for read_csv() and the wrangling verbs below

harmony <- read_csv('https://osf.io/6va7x/download') %>%
  # reshape data so it has one row per trial per person
  pivot_longer(cols = starts_with(match = c('pre', 'post')),
               names_to = "measure",
               values_to = "score") %>%
  # get rid of some not-needed columns
  select(ProlificID:`Political_ideology-text`, measure, score) %>%
  # pull apart the column "measure", which has form "pre1_1", into 2 parts:
  # pre/post + measure number (e.g. "pre1"); and item number
  separate(col = measure,
           into = c("measure0", "Item"),
           sep = '_') %>%
  # create variable indicating whether pre or post play
  mutate(Time_Point = ifelse(grepl('post', measure0), 'post', 'pre'),
         # make sure Time_Point is factor
         Time_Point = factor(Time_Point, levels = c('pre', 'post')),
         # convert Measure from 1,2,3 to informative labels
         Measure = case_when(grepl('1', measure0) ~ 'Reliability',
                             grepl('2', measure0) ~ 'Confidence',
                             grepl('3', measure0) ~ 'Sharing')) %>%
  # group data by measure, condition, person, and item
  group_by(Measure, Condition, ProlificID, Item) %>%
  arrange(Time_Point) %>%
  # compute difference (post minus pre) for each Item for each person/measure/condition
  summarize(Score_Change = diff(score),
            Country = first(`Country-simplified`),
            Gender = first(Gender),
            Education = first(Education),
            Age = first(Age),
            News = first(`News-check`),
            Social_Media = first(`Social-media-use`),
            Political_Ideology = first(`Political_ideology-text`)) %>%
  ungroup() %>%
  group_by(Measure, Condition, ProlificID) %>%
  # compute mean score change (post minus pre) across items for each person/condition/measure; also keep individual-specific covariates (age, etc.)
  summarize(Mean_Change = mean(Score_Change, na.rm = TRUE),
            Country = first(Country),
            Gender = first(Gender),
            Education = first(Education),
            Age = first(Age),
            News = first(News),
            Social_Media = first(Social_Media),
            Political_Ideology = first(Political_Ideology)) %>%
  ungroup()

Exploration

The graph below shows the overall results: how does the game affect people’s ability to identify fake/manipulative posts (Reliability), their confidence in that ability (Confidence), and their propensity to share such posts (Sharing)?

gf_boxplot(Mean_Change ~ Measure,
           color = ~Measure,
           data = harmony) %>%
  gf_facet_grid(~Condition) %>%
  gf_theme(legend.position = 'none') %>%
  gf_labs(y = 'Change in Score\n(Post minus Pre)')

Can you think of an alternative way of displaying this data that makes it easier to compare the score changes for Harmony Square vs. Control? (Make one alternative plot.)
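One possibility (a sketch only; your group may well prefer a different layout) is to put Condition on the x-axis and facet by Measure, so the Harmony Square and control boxes sit right next to each other:

```r
# sketch: Condition side by side within each Measure panel,
# so game vs. control comparisons are direct
gf_boxplot(Mean_Change ~ Condition | Measure,
           color = ~Condition,
           data = harmony) %>%
  gf_theme(legend.position = 'none') %>%
  gf_labs(y = 'Change in Score\n(Post minus Pre)')
```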

Question 1: If possible, send a screenshot of your group’s alternative plot to Prof DR via chat message.

Effects of other covariates on scores?

If it’s already 12:55 or later, skip ahead to the Model section.

Make at least one more graph, to show how Gender, Education, Age, or Political_Ideology affect mean score-changes.
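For example (a sketch, and only one of many reasonable options), you could show score changes by Age group, with the game and control groups side by side within each Measure panel:

```r
# sketch: score change by Age group, colored by Condition,
# faceted by Measure
gf_boxplot(Mean_Change ~ Age | Measure,
           color = ~Condition,
           data = harmony) %>%
  gf_labs(y = 'Change in Score\n(Post minus Pre)')
```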

Question 2: If possible, send a screenshot of your group’s plot to Prof DR via chat message.

Model

I propose the following model for the mean score change:

library(glmmTMB)  # for fitting the mixed-effects model

harmony_mod <- glmmTMB(Mean_Change ~ 
                         Condition*Measure*Gender +
                         Condition*Measure*Education + 
                         Condition*Measure*Age + 
                         Condition*Measure*Political_Ideology + 
                         (1 | Country/ProlificID),
                       data = harmony,
                       control = glmmTMBControl(optCtrl = list(iter.max = 1000, eval.max = 5000)))
# uncomment if you want to see the (long) model summary
# summary(harmony_mod)
car::Anova(harmony_mod)
## Analysis of Deviance Table (Type II Wald chisquare tests)
## 
## Response: Mean_Change
##                                         Chisq Df Pr(>Chisq)    
## Condition                             13.4822  1  0.0002408 ***
## Measure                              168.1769  2  < 2.2e-16 ***
## Gender                                 6.0894  2  0.0476104 *  
## Education                              3.4966  5  0.6238975    
## Age                                   11.2010  4  0.0243960 *  
## Political_Ideology                     0.5903  2  0.7444148    
## Condition:Measure                     98.9205  2  < 2.2e-16 ***
## Condition:Gender                       1.1548  2  0.5613494    
## Measure:Gender                         9.7159  4  0.0454956 *  
## Condition:Education                    4.2717  5  0.5110007    
## Measure:Education                     12.5162 10  0.2519936    
## Condition:Age                          7.6265  4  0.1062571    
## Measure:Age                            7.8201  8  0.4512397    
## Condition:Political_Ideology           0.9320  2  0.6275209    
## Measure:Political_Ideology             9.3250  4  0.0534705 .  
## Condition:Measure:Gender              12.4593  4  0.0142435 *  
## Condition:Measure:Education           17.3473 10  0.0670228 .  
## Condition:Measure:Age                 12.2517  8  0.1403221    
## Condition:Measure:Political_Ideology   3.6462  4  0.4559981    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Question 3: Explain in words: what does it mean for Condition and Measure to interact? It might help to write a sentence of the form: The score-change is different for people who ______ vs. _____, and the size and sign of the difference depends on ______. Also: Why is it essential to include this interaction in our model? Send your answers to Prof DR via chat.

Follow-up discussion question if you have lots of time: Explain what it means for Condition, Measure, and Gender to ALL interact.

Which Predictors Matter?

Can you tell from the summary() and Anova() above which factors affect scores, and how? (If “not really” for the “how”: that is exactly what data graphs and prediction plots are for! We’ll make some in a sec.)

Based on the ANOVA, which predictors and interactions are important? (Remember, if a variable is part of an important interaction, it’s important even if there’s a large p-value for it-alone-not-in-the-interaction…)

Question 4: Send your list of “notable” predictors (ones for which we do have evidence of a relationship with scores) to Prof DR via chat.

Prediction Plots

A prediction plot for the main conclusion (the effect of the Harmony Square game on each Measure’s scores, by Condition) is below.

ggeffects::ggpredict(harmony_mod,
                     terms = c('Measure', 'Condition')) %>%
  plot() %>%
  gf_hline(yintercept = 0) %>%
  gf_labs(title = '',
          y = 'Score Difference (Post - Pre)') %>%
  gf_theme(legend.position = 'top')

Question 5: How does the Harmony Square game affect people’s Reliability, Confidence, and Sharing scores? Remember, a score difference of 0 means that the game-play had NO effect on the score; positive means post-play scores were HIGHER, and negative means post-play scores were LOWER. Send your answer via chat.

Question 6: Looking at the prediction plot you just made…what kind of predictions are these? (Choose one of the options below)

(Send your group’s answer to Prof DR via a chat message.)

Follow-up discussion (you don’t need to send an answer): does your group agree with me, that this kind of prediction makes sense to use here?

Question 7: The right-hand side of our regression model equation has 3 parts: A) fixed effects (intercepts/slopes); B) random effect terms; and C) residuals. Which forms of uncertainty are included in the prediction plots we made?

Send your answer to Prof DR via chat.

Your prediction plot

If you have time…

Make at least one more prediction plot, similar to the one above, that also includes the effect of a demographic predictor (age, gender, education, etc.).

Hint: just add your predictor of choice to the list of terms for ggpredict().
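For example, adding Gender (a sketch; swap in whichever demographic predictor your group chose):

```r
# sketch: predictions by Measure and Condition, split by Gender
ggeffects::ggpredict(harmony_mod,
                     terms = c('Measure', 'Condition', 'Gender')) %>%
  plot() %>%
  gf_hline(yintercept = 0) %>%
  gf_labs(y = 'Score Difference (Post - Pre)')
```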

Write a brief interpretation of the prediction plot in a sentence or two.

What we skipped: Assessment

You are not required to go through this section, but may do so if you have time.

I left this step out today, because I wanted to focus on other parts. But we should, of course, have checked our L.I.N.E. conditions to verify that the model results are reliable. In this case, they look fine!

Suggestion: try to do model assessment yourself, and then compare against these results.

library(DHARMa)  # simulation-based residual diagnostics
harmony <- harmony %>%
  arrange(Country, ProlificID) %>%
  mutate(resids = resid(harmony_mod),
         # predictions for average individual 
         # this is what we'd want to use for most condition checking
         preds = predict(harmony_mod, re.form = ~0),
         # predictions for specific individuals observed in dataset
         ind_preds = predict(harmony_mod, re.form = NULL))
harmony_sim <- simulateResiduals(harmony_mod)
gf_point(harmony_sim$scaledResiduals ~ 
           preds, data = harmony,
         alpha = 0.2)

gf_histogram(~resids, data = harmony)

s245::gf_acf(~harmony_mod)