This post is about some issues in Study 3 of the following article:

*Journal of Experimental Psychology: General*, *148*(9), 1557–1574. https://doi.org/10.1037/xge0000639

*t* test from the study, followed by the ANCOVAs to test whether gender moderated the relationship between condition and emotional accuracy.

*dishonest* condition (*M* = 1.58, *SD* = 0.63) were significantly less accurate at detecting others’ mental and affective states than those in the *honest* condition (*M* = 1.39, *SD* = 0.54), *t*(209) = 2.37, *p* = .019", p. 1564, emphasis in original). My R code gave me this result:

`> t.test(df.repl.dis$EmoAcc, df.repl.hon$EmoAcc, var.equal=TRUE)`

Two Sample t-test

data: df.repl.dis$EmoAcc and df.repl.hon$EmoAcc

t = 2.369, df = 209, p-value = 0.01875

*t* test.

`> t.test(df.full.dis$EmoAcc, df.full.hon$EmoAcc, var.equal=TRUE)`

Two Sample t-test

data: df.full.dis$EmoAcc and df.full.hon$EmoAcc

t = 0.20148, df = 246, p-value = 0.8405

*t* test are only 246 rather than the expected 248. On inspection of the dataset, it appears that one record has NA for experimental condition and another has NA for emotional accuracy. Both of these records were also manually excluded by the research assistants, but they could not have been used in any of the *t* tests anyway. Hence, it seems fairer to say that 37 out of 248 participant pairs, rather than 39 out of 250, were excluded based on notes made by the RAs.)
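The sample-size bookkeeping above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and the counts are the ones quoted in this post.

```r
# Counts quoted in the post (variable names are illustrative only).
n_records        <- 250  # participant pairs in the dataset
n_excluded_total <- 39   # pairs excluded based on the RAs' notes
n_unusable       <- 2    # excluded pairs with NA for condition or emotional accuracy

n_usable    <- n_records - n_unusable         # pairs that can enter a t test
n_excl_used <- n_excluded_total - n_unusable  # exclusions among the usable pairs
n_retained  <- n_usable - n_excl_used         # pairs left after the exclusions

# A two-sample t test with equal variances assumed has n - 2 degrees of freedom.
df_full     <- n_usable - 2
df_retained <- n_retained - 2

print(c(usable = n_usable, excluded = n_excl_used, retained = n_retained))
print(c(df_full = df_full, df_retained = df_retained))
```

The resulting degrees of freedom agree with the 246 in the full-sample output and with the *t*(209) reported in the article.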

*t* test (*p* = 0.8405 versus *p* = 0.01875). Had these 37 participant pairs not been excluded, there would be no statistically significant difference between the conditions; put another way, the exclusions drive the entire effect. I ran the same *t* test on (only) these excluded participants:

`> t.test(df.exconly.dis$EmoAcc, df.exconly.hon$EmoAcc, var.equal=TRUE)`

Two Sample t-test

data: df.exconly.dis$EmoAcc and df.exconly.hon$EmoAcc

t = -4.1645, df = 35, p-value = 0.0001935

*d* for this test is 1.412, which is a very large effect indeed among people who are not paying attention.
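For readers who want to check the effect size, one standard conversion from a pooled-variance *t* statistic to Cohen's *d* is sketched below. The helper name `cohens_d_from_t` is made up for illustration, and it is an assumption that the value in this post was computed this way; the group sizes (14 and 23) are the counts of excluded pairs per condition given later in this post.

```r
# Convert a pooled-variance (equal variances assumed) t statistic to Cohen's d:
#   d = t * sqrt(1/n1 + 1/n2)
# cohens_d_from_t is a hypothetical helper, not part of any package.
cohens_d_from_t <- function(t_stat, n1, n2) {
  t_stat * sqrt(1 / n1 + 1 / n2)
}

# t statistic from the excluded-only test; 14 "dishonest" and 23 "honest" pairs.
d <- cohens_d_from_t(-4.1645, n1 = 14, n2 = 23)

# The sign only reflects the order of the groups, so report the magnitude.
print(round(abs(d), 3))
```

With these inputs the magnitude rounds to 1.412.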

*t* tests:

*t*(34) = 4.56", reflecting a *t* test with equal variances not assumed in which the calculated degrees of freedom were 34.466. This is actually the more correct way to calculate the *t* statistic, but I have been using "equal variances assumed" in all of the other analyses in this post for compatibility with the original article, which used analyses from SPSS in which the assumption of equal variances is the default. See also this article.]]
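To illustrate the difference between the two versions of the test, here is a minimal sketch using simulated scores. These are not the study's data: the group sizes mirror the 14 and 23 excluded pairs, but the means, standard deviations, and seed are arbitrary assumptions.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated scores, NOT the study's data. Group sizes mirror the excluded pairs;
# the means and (deliberately unequal) SDs are arbitrary.
x <- rnorm(14, mean = 1.1, sd = 0.4)
y <- rnorm(23, mean = 2.1, sd = 0.9)

student <- t.test(x, y, var.equal = TRUE)   # pooled variance, df = n1 + n2 - 2
welch   <- t.test(x, y, var.equal = FALSE)  # Welch's test, which is R's default

print(unname(student$parameter))  # whole-number degrees of freedom
print(unname(welch$parameter))    # Welch-Satterthwaite df, usually fractional
```

With equal variances assumed, the degrees of freedom are always the total sample size minus 2 (here 35). Welch's test estimates them from the two sample variances, which is why a fractional value such as 34.466 can appear.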

*in the opposite direction to the hypothesis* (as shown by the negative *t* statistic). That is, if these results are to be believed, something about the fact that either A or B was not following the study instructions made A much better (or less bad) at determining B's emotions when telling a fake (versus true) story. There were 14 excluded participant pairs in the "dishonest" condition, with a mean emotional accuracy score (lower = more accurate) for A of 1.143, and 23 in the "honest" condition, with a mean emotional accuracy score of 2.076; for comparison, the mean score for the full sample across both conditions is 1.523.

*p* values less than 0.05 for the *t* test on the resulting sample of 211 pairs. The smallest *p* value that I obtained was 0.03387, which is higher than the one reported in the article. To put it another way, out of a million attempts I was unable to obtain even one result as extreme as the published one by chance.
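A simulation of this kind can be sketched as follows. This is not the code used for the results above. It assumes that each attempt drops 37 of the 248 usable pairs at random and reruns the equal-variances *t* test on the remaining 211. The data are simulated stand-ins, and `p_after_random_exclusion` is a hypothetical helper name.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only

# Simulated stand-in for the 248 usable pairs; NOT the study's data.
n_pairs   <- 248
condition <- rep(c("dishonest", "honest"), length.out = n_pairs)
emo_acc   <- rnorm(n_pairs, mean = 1.5, sd = 0.6)

# Drop n_exclude pairs at random, then return the p value of the pooled
# t test comparing the two conditions among the remaining pairs.
p_after_random_exclusion <- function(score, cond, n_exclude) {
  keep <- sample(length(score), length(score) - n_exclude)
  kept_score <- score[keep]
  kept_cond  <- cond[keep]
  t.test(kept_score[kept_cond == "dishonest"],
         kept_score[kept_cond == "honest"],
         var.equal = TRUE)$p.value
}

n_iter   <- 1000  # the post describes a million attempts; kept small here
p_values <- replicate(n_iter, p_after_random_exclusion(emo_acc, condition, 37))

print(min(p_values))          # smallest p value across the random exclusions
print(mean(p_values < 0.05))  # share of attempts reaching p < .05
```

Comparing the smallest simulated *p* value with the published one shows how unusual the reported result would be if the exclusions had been unrelated to the outcome.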