In the past couple of years I have
reviewed half a dozen manuscripts with abstracts that go something like this:
<Construct
X> is known to be associated with higher levels of well-being and healthy
psychological functioning, as indexed by <Construct Y>. However, to date,
no study has investigated the role of <Construct M> in this association.
The present study bridges this gap by testing a mediation path model in a
sample of undergraduates (N = 100).
As predicted, M fully mediated the positive association between X and Y. These results suggest that X predicts higher levels
of M, which subsequently predicts higher levels of Y. These results provide new
insight that may advance a coherent theoretical framework on the pathways by
which M enhances psychological well-being.
There is typically a description of
how the 100 participants completed measures of constructs X, M, and Y, with a
table of correlations that might look like this:
        X        Y
Y     .24*
M     .52***   .32**
* p < .05; ** p < .01; *** p < .001.
Then we get to the mediation
analysis. More often than not this is done using the PROCESS macro in SPSS, but
it can also be done “by hand” using a few ordinary least-squares regressions. Here
are the steps required (cf. Baron & Kenny, 1986), with an R sketch after the list:
- Show that X is a significant predictor of Y. You probably don’t actually need to run the regression for this, as the standardized regression coefficient and its associated p value will be identical to the correlation coefficient between X and Y and its p value, but sometimes the manuscript will show the SPSS output to prove that the authors conducted this regression anyway. (In the last manuscript that I reviewed, the authors performed the single-predictor regression and managed to obtain a standardized regression coefficient that was different to the zero-order correlation, which did not enhance my confidence in the rest of their analyses.) Here the p value will be .016.
- Show that X is a significant predictor of M. Again, no regression is required for this, as it’s just the correlation coefficient. The p value in this example is about 3E−8.
- Regress Y on both X and M. If you get a significant regression coefficient for M then you have at least a “partial” mediation effect. If, in addition, the regression coefficient for X is non-significant then you have “full” mediation. Here, this produces the following standardized coefficients:
- M: β = 0.268, p = .018
- X: β = 0.101, p = .368
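For anyone who wants to check these numbers, here is a minimal R sketch (my own illustration, not the code linked later in the post) that simulates standardized data reproducing the correlation table exactly, then runs the three regressions:

```r
# A minimal sketch of the three steps. MASS::mvrnorm with empirical = TRUE
# generates standardized data that matches the correlation table exactly.
library(MASS)

R <- matrix(c(1.00, 0.24, 0.52,
              0.24, 1.00, 0.32,
              0.52, 0.32, 1.00),
            nrow = 3,
            dimnames = list(c("X", "Y", "M"), c("X", "Y", "M")))
d <- as.data.frame(mvrnorm(n = 100, mu = rep(0, 3), Sigma = R,
                           empirical = TRUE))

summary(lm(Y ~ X, data = d))      # Step 1: X predicts Y (p = .016)
summary(lm(M ~ X, data = d))      # Step 2: X predicts M (p ~ 3E-8)
summary(lm(Y ~ X + M, data = d))  # Step 3: M significant, X no longer so
```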
Ta-da!
In this example, we have complete mediation: The p value for the mediator, M, is significant and the p value for X isn’t. We conclude that Construct
M fully mediates the relation between Construct X and Construct Y. We write
it up and celebrate our fine contribution to understanding
the mechanisms that lead to well-being. Surely the end of mental distress is only one more grant away.
The problem is this: Absolutely any
other variable that you might put in place of M, and which is correlated in the
same way with X and Y, will also show exactly
the same mediation effect. And there is no shortage of things you can
measure—in psychology, at least—that are correlated at around .5 and .3 with two other variables that are themselves intercorrelated at around .2. Let’s say that X is some aspect of socioeconomic status and Y is
subjective well-being. You can easily come up with any number of ideas for M:
gratitude, optimism, self-esteem, all of the Big Five personality traits (if
you reverse-score neuroticism as emotional stability), etc., without even
needing to resort to Lykken and Meehl’s
“crud factor” (“in psychology and sociology everything correlates with everything”; Meehl, 1990, p. 204). Does it make sense for multiple third variables all to apparently fully mediate the relation between a predictor and an outcome variable?
I wrote some R code, which you can find here, to demonstrate
the example that I gave above.
You will see that I performed the calculations in two ways. The first was to
generate (with a bit of trial and error) some random data with the correct
correlations. (This produces a bit of rounding error, so the p value for the beta for M in the
regression is reported at .019, not .018.) The second—my preferred method,
since you can generally use this starting with the table of descriptives that
appears in an article—is to start with the correlations and perform the
regression calculations from there. (A surprising number of people do not seem
to know that you can generally determine the standardized coefficients of multiple
regression models just from the correlation table. The standard errors—and, hence,
the p values—can then be derived from
the sample size. If you have the standard deviations as well, you can get the unstandardized coefficients. Add in the means and
you can calculate the intercepts too. Again, this can all be done from the descriptive statistics, which is probably why the complete table of descriptives and correlations used to be standard in every paper. You don't need the raw data for any of this.)
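Here is a minimal sketch of that recipe in R (again my own illustration; it assumes standardized variables, so it returns the betas rather than the unstandardized coefficients):

```r
# Standardized betas, standard errors, and p values straight from the
# correlation table; no raw data needed. beta = Rxx^{-1} %*% rxy, with
# SEs derived from (1 - R^2) and df = n - k - 1.
betas_from_cors <- function(Rxx, rxy, n) {
  k     <- length(rxy)             # number of predictors
  beta  <- solve(Rxx, rxy)         # standardized coefficients
  R2    <- sum(beta * rxy)         # model R-squared
  se    <- sqrt((1 - R2) * diag(solve(Rxx)) / (n - k - 1))
  t_val <- beta / se
  data.frame(beta = beta, se = se, t = t_val,
             p = 2 * pt(-abs(t_val), df = n - k - 1))
}

# Predictors X and M, outcome Y, correlations from the table above:
Rxx <- matrix(c(1.00, 0.52,
                0.52, 1.00), nrow = 2,
              dimnames = list(c("X", "M"), c("X", "M")))
betas_from_cors(Rxx, rxy = c(X = 0.24, M = 0.32), n = 100)
# Roughly: beta_X ~ 0.101 (p ~ .37), beta_M ~ 0.268 (p ~ .018-.019,
# depending on rounding).
```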
If your initial choice for variable
X is more strongly correlated with Y than M is, then you can very often just swap
X and M around, because there is typically nothing to say that whatever X is
measuring occurs “before” whatever M is measuring, or vice versa—especially if you just hauled a bunch of undergraduates in and gave them measures of their current levels of X, M, and Y to complete. The reason
why you want your mediator, M, to be more strongly correlated than X with the
outcome (Y) is a little-known phenomenon of two-variable regression that I
like to think of as a sort of “Matthew effect”. Feel free to skip the next
paragraph, in which I explain this in tedious detail.
When the two predictors are
moderately strongly correlated with each other (.52, in our case), then
although their zero-order correlations with the outcome variable might be quite
close together (.32 and .24 here), their standardized regression coefficients will diverge
by quite a bit more than their respective correlation
coefficients. Here, M’s correlation of .32 led to a beta of 0.268, which is a
16% reduction, but X’s correlation of .24 was reduced by 58% to a beta of
0.101. If the correlation between M and X had been a little higher (e.g., .60 instead of .52), the beta
for M would actually have been larger (0.275) and the beta for X would have been even smaller (0.075). At
some point along the M–X correlation continuum (around .75), the beta for M
would be exactly equal to the correlation coefficient of M with Y (as if X wasn't in the regression model at all), and the beta
for X would be zero. Continuing even further, we would hit “negative
suppression” territory, with M’s standardized regression coefficient being greater than the original correlation
coefficient of .32, and X’s standardized regression coefficient being negative. Many people seem to have a rather naïve
view of multiple regression in which the addition of a new predictor results in the betas
for all of the predictors being reduced in some roughly equal proportion, but the
reality is often nothing like that. You can explore what happens with just two
predictors (with more, things get even wilder) here using my Shiny app.
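If you would rather see the arithmetic than play with sliders, here is a small sketch (my own illustration) that computes the two standardized coefficients directly from the three correlations, using the standard formula for two standardized predictors:

```r
# The "Matthew effect" by direct computation. For two standardized
# predictors, beta_M = (r_MY - r_XY * r_MX) / (1 - r_MX^2), and
# symmetrically for beta_X. Sweep the M-X correlation and watch:
r_MY <- 0.32   # correlation of mediator with outcome
r_XY <- 0.24   # correlation of predictor with outcome
for (r_MX in c(0.52, 0.60, 0.75, 0.85)) {
  beta_M <- (r_MY - r_XY * r_MX) / (1 - r_MX^2)
  beta_X <- (r_XY - r_MY * r_MX) / (1 - r_MX^2)
  cat(sprintf("r_MX = %.2f: beta_M = %6.3f, beta_X = %6.3f\n",
              r_MX, beta_M, beta_X))
}
# r_MX = 0.52: beta_M =  0.268, beta_X =  0.101
# r_MX = 0.60: beta_M =  0.275, beta_X =  0.075
# r_MX = 0.75: beta_M =  0.320, beta_X =  0.000
# r_MX = 0.85: beta_M =  0.418, beta_X = -0.115  (negative suppression)
```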
So it’s possible to build an almost
infinite number of mediation studies, all of which will appear to tell us
something about the mediation of the relation between two psychological variables by a third, although almost all of them are just
illustrating a known phenomenon of multiple regression. Again, everything is
determined by the three correlations between the variables, plus the sample
size if you care about statistical significance. (Alert readers will have
noticed that whether mediation is “full” or merely “partial” depends to a
large extent on the sample size; with enough participants, even the small residual
effect of X on Y will be statistically significant, with a p value below .05,
and the “full” mediation will quietly become “partial”.
But of course, alert readers will also know that these days statistical
significance doesn’t mean very much on its own, right?)
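Here is a quick sketch of that, holding the three correlations from the running example fixed and varying only the sample size (it reuses the standard-error formula from the earlier snippet):

```r
# How "full" becomes "partial" as N grows: the residual standardized beta
# for X stays ~0.101 (it depends only on the correlations), but its
# standard error shrinks with N.
p_for_X <- function(n) {
  beta_X <- 0.101                 # from the correlations above
  R2     <- 0.110                 # model R-squared, also fixed
  se     <- sqrt((1 - R2) / ((n - 3) * (1 - 0.52^2)))
  2 * pt(-abs(beta_X / se), df = n - 3)
}
round(sapply(c(100, 200, 400, 800), p_for_X), 3)
# 0.371 0.201 0.069 0.010  -- by N = 800, the "full" mediation is gone
```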
Now, am I saying that all of the
mediation articles that I get to review are based on an atheoretical “throw
some numbers at the wall and see what sticks” approach, which might be an implication of what I have argued here? Well, no... but I’m also not saying that it never happens. I have heard first-hand from several grad students what
happens when they have a bunch of variables and no obvious result: their
supervisor suggests that they write them up as a mediation analysis.
I don’t think that preregistration
will necessarily help all that much here, because it is quite predictable from
previous knowledge that X, M, and Y will have the pattern of correlations
needed to produce an apparent mediation effect. I’m going to suggest that the
only solution is to refrain from doing this kind of mediation analysis altogether
in the absence of (a) much better theoretical justification than we currently see, and (b) some kind of constraint on the temporal order in which changes in X, M, and Y occur.
Without a demonstration that the causal arrows are running from X to M and M to Y (MacKinnon & Pirlott, 2014), and not vice versa, we have no
way of knowing whether we are dealing with mediation or confounding,
especially since in many cases the constructs X, M, and Y may themselves be
caused by multiple other factors, and so ad infinitum (cf. Arah, 2008). In the absence of experimental manipulation, causality is hard to demonstrate, especially in psychology.
References
Arah, O. A. (2008). The role of causal reasoning in understanding Simpson's paradox, Lord's paradox, and the suppression effect: Covariate selection in the analysis of observational studies. Emerging Themes in Epidemiology, 5, 5. https://doi.org/10.1186/1742-7622-5-5
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51(6), 1173–1182. https://doi.org/10.1037/0022-3514.51.6.1173
MacKinnon, D. P., & Pirlott, A. G. (2014). Statistical approaches for enhancing causal interpretation of the M to Y relation in mediation analysis. Personality and Social Psychology Review, 19(1), 30–43. https://doi.org/10.1177/1088868314542878
Meehl, P. E. (1990). Why summaries of research on psychological theories are often uninterpretable. Psychological Reports, 66(1), 195–244. https://doi.org/10.2466/pr0.1990.66.1.195
(Thanks to Julia Rohrer for her helpful comments on an earlier draft of this post. If the whole thing is garbage, it's probably because I didn't incorporate more of her thoughts.)
Hi Nick,
Thank you for this blog post, which highlights one of the most important issues in mediation testing. I have to admit that I have also been guilty of publishing several mediation analyses based on cross-sectional self-report data. I guess that researchers are just not aware of this problem unless they dive deeper into this topic.
I just wanted to add some notes:
Researchers who use PROCESS actually shouldn't write about complete vs. partial mediation. PROCESS estimates the indirect effect as the product of the a and b paths, and this indirect effect is either significant or not. Thus, there is no place for the concept of partial mediation in the framework of Hayes and others. Also, there can be an indirect effect in addition to, or in the absence of, a direct and a total effect. So the significance of the single paths actually doesn't matter much; that is, significant single paths are not a prerequisite for finding a mediation effect (but, of course, as you write, it is very likely that you will find one if all three variables are correlated with each other).
Unfortunately, it seems that most users of PROCESS haven't read Hayes' book, so they still use the outdated thinking when writing about their analyses (https://open.lnu.se/index.php/metapsychology/article/view/870)...
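For concreteness, here is a sketch of the indirect effect with a percentile bootstrap in plain R, without PROCESS (it assumes the simulated data frame d from the first sketch in the post):

```r
# Percentile-bootstrap CI for the indirect (a*b) effect.
set.seed(1)
ab <- replicate(5000, {
  i <- sample(nrow(d), replace = TRUE)
  a <- coef(lm(M ~ X, data = d[i, ]))["X"]      # a path: X -> M
  b <- coef(lm(Y ~ X + M, data = d[i, ]))["M"]  # b path: M -> Y, given X
  a * b
})
quantile(ab, c(.025, .975))  # CI excluding zero => "significant" indirect effect
```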
Hi Nick,
I've only recently stumbled across your blog, but I've spent the last few days reading and enjoying it. Thanks a lot for all the effort that goes into this!