24 May 2020

The Silence of the RIOs

Just over a month ago, I published these two blog posts. After the first, Daniël Lakens tweeted this:
I thought that was a good idea, so I set out to find who the "university ethics person" might be for the 15 co-authors of the article in question. (I wrote directly and separately to the two PhD supervisors of the lead author, as it is he who appears to be prima facie responsible for most of its deficiencies; I also wrote to Nature Scientific Reports outlining my concerns about the article. In both cases I received a serious reply indicating that they were concerned about the situation.)

It turns out that finding the address of the person to whom complaints about research integrity at a university or other institution should be sent is not always easy. There were only one or two cases where I was able to do this by following links from the institution's web site, as regular readers of xkcd might have been able to guess. In a few cases I used Google with the site: option to find a person. But about half the time, I couldn't identify anyone. In those cases I looked for the e-mail address of someone who might be the dean or head of department of the author concerned. Hilariously, in one case, the author was the head of department and I ended up writing to the president of the university.

Anyway, by 24 April 2020 I had what looked like a plausible address at all nine institutions, so I sent this e-mail.
From: Nicholas Brown <nicholas.brown@lnu.se>
Sent: 24 April 2020 16:04
To: [9 people]
Subject: Possible scientific misconduct in an article published in Nature Scientific Reports
First, allow me to apologise if I have addressed this e-mail to any of you in error, and also if my use of the phrase "Research Integrity Officer" in the above salutation is not an accurate summary of your job title. I had some difficulty in establishing, from your institution's web site, who was the correct person to write to for questions of research integrity in many cases, including [list]. In those cases I attempted to identify somebody who appears to have a senior function in the relevant department. In the case of [institution], I only found a general contact address --- I am trying to reach someone who might have responsibility for the ethical conduct of "XXX" in the XXX Department.
I am writing to bring your attention to these [sic; I started drafting the e-mail before I wrote the second post, and not everything about it evolved correctly after that] blog posts, which I published on April 21, 2020: https://steamtraen.blogspot.com/2020/04/some-issues-in-recent-gaming-research.html.
At least one author of the scientific article that is the principal subject of that blog post (Etindele Sosso et al., 2020; https://doi.org/10.1038/s41598-020-58462-0, published on 2020-02-06 in Nature Scientific Reports) lists your institution as their affiliation. 
While my phrasing in that public blog post (and a follow-up, which is now linked from the first post) was necessarily conservative, I think it is clear to anyone with even a minimum of relevant scientific training who reads it that there is strong prima facie evidence that the results of the Etindele Sosso et al. (2020) article have been falsified, and perhaps even fabricated entirely. Yet, 15 other scholars, including at least one at your institution (in the absence of errors of interpretation on my part) signed up to be co-authors of this article.
There would seem to be two possibilities in the case of each author.
1. They knew, or should have known, that the reported results were essentially impossible. (Even the Abstract contains claims about the percentage of variance explained by the main independent variable that are utterly implausible on their face.)
2. They did not read the manuscript at all before it was submitted to a Nature group journal, despite the fact that their name is listed as a co-author and included in the "Author contributions" section as having, at least, "contributed to the writing".
It seems to me that either of these constitutes a form of academic misconduct. If these researchers knew that the results were impossible, they are culpable in the publication of falsified results. If they are not --- that is, their defence is that they did not read and understand the implications of the results, even in the Abstract --- then they have made inappropriate claims of authorship (in a journal whose own web site states that it is the 11th most highly cited in the world). Either of these would surely be likely to bring your institution into disrepute.
For your information, I intend to make this e-mail public 30 days from today, accompanied by a one-sentence summary (without, as far as possible, revealing any details that might be damaging to the interests of anyone involved) of your respective institutions' responses until that point. I would hope that, despite the difficult circumstances under which we are all working at the moment, it ought to be able to at least give a commitment to thoroughly investigate a matter of this importance within a month. I mention this because in previous cases where I have made reports of this kind, the modal response from institutional research integrity officers has been no response at all.
Of course, whatever subsequent action you might decide to take in this matter is entirely up to you.
Kind regards,
Nicholas J L Brown, PhD
Linnaeus University
The last-but-one paragraph of that e-mail mentions that, 30 days from the date of the e-mail, I intended to make it public, along with a brief summary of the responses from each institution. The e-mail is above. Here is how each institution responded:

Nottingham Trent University, Nottingham, UK: Stated that they would investigate, and gave me an approximate date by which they anticipated that their investigation would be complete.
Central Queensland University, Rockhampton, Australia: Stated that they would investigate, but with no estimate of how long this would take.
Autonomous University of Nuevo Leon, Monterrey, N.L., Mexico: No reply.
Jamia Millia Islamia, New Delhi, India: No reply.
University of L’Aquila, L’Aquila, Italy: No reply.
Army Share Fund Hospital, Athens, Greece: No reply.
Université de Montréal, Montréal, Québec, Canada: No reply.
University of Limerick, Limerick, Ireland: No reply.
Lero Irish Software Research Centre, Limerick, Ireland: No reply.

By "No reply" here, I mean that I received nothing. No "Undeliverable" message. No out-of-office message. No quick reply saying "Sorry, COVID-19 happened, we're busy". Not "We'll look into it". Not "We won't look into it". Not even "Get lost, there is clearly no case to answer here". Nothing, nada, nichts, rien, zip, in reply to what I (and, apparently, the research integrity people at the two institutions that did reply) think is a polite, professional e-mail, with a subject line that I hope suggests that a couple of minutes of the recipient's time might be a worthwhile investment, in 7 out of 9 cases.

I find this disappointing. I wish I could say that I found it remotely surprising. Maybe I should just be grateful that Daniël's estimate of one institution taking any sort of action was exceeded by 100%.


16 May 2020

The perils of improvising with linear regression: Stedman et al. (in press)

This article has been getting a lot of coverage of various kinds in the last few days, including the regional, UK national, and international news media:

Stedman, M., Davies, M., Lunt, M., Verma, A., Anderson, S. G., & Heald, A. H. (in press). A phased approach to unlocking during the COVID-19 pandemic – Lessons from trend analysis. International Journal of Clinical Practice. https://doi.org/10.1111/ijcp.13528

There doesn't seem to be a typeset version from the journal yet, but you can read the final draft version online here and also download it as a PDF file.

The basic premise of the article is that, according to the authors' model, COVID-19 infections are far more widespread in the community in the United Kingdom(*) than anyone seems to think. Their reasoning works in three stages. First, they built a linear model of the spread of the disease, one of whose predictors was the currently reported number of total cases (i.e., the official number of people who have tested positive for COVID-19). Second, they extrapolated that model to a situation in which the entire population was infected, assuming that the spread continues to be entirely linear. Third, they used the slope of their line to estimate what the official reported number of cases would be at that point. They concluded that their model shows that the true number of cases in the population is 150 times larger than the number of positive tests that have been carried out, so that on the day when their data were collected (24 April 2020) 26.8% of the population were already infected.

The above figures are from the Results section of the paper, on p. 11 of the final draft PDF. However, the Abstract contains different numbers, which seem to be based on data from 19 April 2020. The Abstract asserts that the true number of cases in the population may be 237 (versus 150) times the reported number, and that the percentage of the population who had been infected might have been 29% (versus 26.8% several days later). Aside from the question of why the Abstract includes different principal results from the Results section, it would appear to be something of a problem for the authors' assumptions of (ongoing) linearity in the relation between the spread of the disease and the number of reported cases if the multiplier implied by the slope of their model changed by more than a third (from 237 to 150) over five days.

But it seems to me that, apart from the rather tenuous assumptions of linearity and the validity of extrapolating a considerable way beyond the range of the data (which, to be fair, they mention in their Limitations paragraph), there is an even more fundamental problem with how the authors have used linear regression here. Their regression model contained at least nine covariates(**), and we are told that "The stepwise(***) regression of the local UTLA factors to RADIR showed that only one factor total reported cases/1,000 population [sic] was significantly linked" (p. 11). I take this to mean that, if the authors had reported the regression output in a table, this predictor would be the only one whose coefficient had an absolute value at least twice its standard error. (The article is remarkably short on numerical detail, with neither a table of regression output nor a table of descriptives and correlations of the variables. Indeed, there are many points in the article where a couple of minutes spent on attention to detail would have greatly improved its quality, even in the context of the understandable desire to communicate results rapidly during a pandemic.)

Having established that only one predictor in this ten-predictor regression was statistically significant (in the sense of a 95-year-old throwaway remark by R. A. Fisher), the authors then proceeded to do something remarkable. Remember, they had built this model:

Y = B0 + B1X1 + B2X2 + B3X3 + ... + B10X10 + Error

(with Error apparently representing 78% of the variance, cf. line 3 of p. 11). But they then dropped ten of those terms (nine regression coefficients multiplied by the values of the predictors, plus the error term) to come up with this model (p. 11):

RADIR = 1.06 - 0.16 x Current Total Cases/1,000

What seems to have happened here is that the authors in effect decided to set the regression coefficients B2 through B10 to zero, apparently because their respective values were less than twice their standard errors and so "didn't count" somehow. However, they retained the intercept (B0) and the coefficient associated with their main variable of interest (B1) from the 10-variable regression, as if the presence of the nine covariates had had no effect on the calculation of these values. But of course, those covariates had had an effect on the estimation of both the intercept and the coefficient of the first variable. That was precisely what they were included in the regression for. If the authors had wanted to make a model with just one predictor (the number of current total cases), they could have done so quite simply with a one-variable regression. You can't just run a multiple regression and keep the coefficients (with or without the intercept) that you think are important while throwing away the others.
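To make the point concrete, here is a minimal sketch in Python using simulated data. None of this is the authors' data or code: the variable names (x1, x2, y), the "true" coefficients, the sample size, and the small ols() helper are all made up for illustration. It fits a two-predictor regression and a one-predictor regression to the same outcome and shows that the intercept and the coefficient for x1 are not interchangeable between the two fits when the predictors are correlated.

```python
import random

def ols(predictors, y):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares (normal equations)."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Solve by Gaussian elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(xtx[r][i]))
        xtx[i], xtx[p] = xtx[p], xtx[i]
        xty[i], xty[p] = xty[p], xty[i]
        for r in range(i + 1, k):
            f = xtx[r][i] / xtx[i][i]
            for c in range(i, k):
                xtx[r][c] -= f * xtx[i][c]
            xty[r] -= f * xty[i]
    # Back-substitution.
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(xtx[i][c] * coef[c] for c in range(i + 1, k))
        coef[i] = (xty[i] - tail) / xtx[i][i]
    return coef

# Simulated data (hypothetical): x2 is correlated with x1, and both affect y.
rng = random.Random(2020)
n = 2000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + 0.6 * rng.gauss(0, 1) for a in x1]
y = [1.0 - 0.2 * a + 0.5 * b + 0.5 * rng.gauss(0, 1) for a, b in zip(x1, x2)]

full = ols([x1, x2], y)   # [b0, b1, b2] from the two-predictor model
single = ols([x1], y)     # [b0, b1] from the one-predictor model

print("two-predictor fit: b0=%.2f b1=%.2f b2=%.2f" % tuple(full))
print("one-predictor fit: b0=%.2f b1=%.2f" % tuple(single))
```

By construction, the coefficient for x1 should come out near -0.2 when x2 is in the model and near +0.2 when it is not, because the one-predictor slope absorbs the part of x2's effect that is shared with x1. Copying b0 and b1 out of the first fit and calling the result a one-predictor model would describe neither regression.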

This seems to me to be a rather severe misinterpretation of what a regression model is and how it works. There are many other things that could be questioned about this study(****), and indeed several people are already doing precisely that, but this seems to me to be a very fundamental problem with the article, and something that the reviewers really ought to have picked up. The first two authors appear to be management consultants whose qualifications to conduct this sort of analysis are unclear, but the third author's faculty page suggests that he knows his stuff when it comes to statistics, so I'm not sure how this was allowed to happen.

Stedman et al. end with this sentence: "The manuscript is an honest, accurate, and transparent account of the study being reported. No important aspects of the study have been omitted." This is admirable, and I take it to mean that the authors did not in fact run a one-predictor regression to estimate the effect of their main IV of interest on their DV before they decided to run a stepwise regression with nine covariates. However, I suggest that it might be useful if they were to run that one-predictor regression now, and report the results along with those of the multiple regression (cf. Simmons, Nelson, & Simonsohn, 2011, p. 1362, Table 2, point 6). When they do that, they might also consider incorporating the latest testing data and see if the slope of their regression has changed, because since 24 April the number of cases in the UK has more than doubled (240,161 at the moment I am writing this), suggesting that between 54% and 84% of the population has by now become infected, depending on whether we take the numbers from p. 11 of the article or those from the Abstract.

[[ Update 2020-05-18 00:15 UTC: There is a preprint of this paper, available here. It contains the same basic model, which is estimated using data from 11 days earlier than the accepted manuscript. In the preprint, the regression equation (p. 7) is:

RADIR = 1.20 - 0.26 x Current Total Cases/1,000

In other words, between the submission date of the preprint and the preparation date of the final manuscript, the slope of the regression line --- which the model assumes would be constant until everyone was infected --- changed from -0.26 to -0.16. And yet the authors did not apparently think that this was sufficient reason to question the idea that the progress of the disease would continue to match their linear model, despite direct evidence that it had failed to do so over the previous 11 days. This is quite astonishing. ]]
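As a rough check on what the two equations imply, the short Python sketch below computes where each fitted line crosses RADIR = 0. It rests on an assumption that is my reading rather than something the paper spells out in these terms: that the whole population is taken to be infected at the zero crossing, so that the ratio of true to reported cases is 1,000 divided by the reported cases per 1,000 at that point. The function and dictionary names are hypothetical.

```python
def zero_crossing(intercept, slope):
    """Value of x at which intercept + slope * x equals zero."""
    return -intercept / slope

# (intercept, slope) pairs as quoted from the two versions of the paper.
models = {
    "preprint (p. 7)": (1.20, -0.26),
    "accepted manuscript (p. 11)": (1.06, -0.16),
}

results = {}
for label, (b0, b1) in models.items():
    x0 = zero_crossing(b0, b1)   # reported cases per 1,000 when RADIR = 0
    multiplier = 1000 / x0       # assumed: everyone is infected at that point
    results[label] = (x0, multiplier)
    print(f"{label}: RADIR = 0 at {x0:.2f} cases/1,000, multiplier ~{multiplier:.0f}")
```

Under that assumption, the accepted manuscript's equation gives a multiplier of about 151, close to the 150 in the Results section, while the preprint's equation gives about 217. The extrapolated answer is very sensitive to a slope that moved substantially in 11 days.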

(*) Whether the authors claim that their model applies to the UK or just to England is not entirely clear, as both terms appear to be used more or less interchangeably. They use a figure of 60 million as the population of England, although the Office for National Statistics reports figures of 56.3 million for England and 66.8 million for the UK in mid 2019.

(**) I wrote here that there were nine, but one of them is ethnicity, which would typically have been coded as a series of categories, each of which would have functioned as a separate predictor in the regression. But maybe they used some kind of shorthand such as "Percentage who did/didn't identify in the 'White British' category", so I'll continue to assume that there were nine covariates and hence 10 predictors in total.

(***) Stepwise regression is generally not considered a good idea these days. See for example here. Thanks to Stuart Ritchie for this tip and for reading through a draft of this post.

(****) Looking at Figure 4, it occurs to me that a few data points at the top left might well be lifting the left-hand side of the regression line up to some extent, but that's all moot until we know more about the single-variable regression. Also, there is no confidence interval or any other measure of the uncertainty --- even assuming that the model is perfectly linear --- of the estimated reported case rate when the infection rate drops to zero.
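
On the point in footnote (**) about ethnicity: the sketch below shows, in Python, how a single categorical variable is typically expanded into several 0/1 indicator predictors, one per category apart from a reference level. The category labels and the dummy_code() helper are hypothetical, and the paper does not say how ethnicity was actually coded.

```python
def dummy_code(values, reference):
    """Turn one categorical variable into 0/1 indicator columns,
    omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

# Hypothetical category labels, for illustration only.
ethnicity = ["A", "B", "C", "A", "D", "B"]
columns = dummy_code(ethnicity, reference="A")

# Four categories become three separate predictors in the regression.
print(len(columns), "indicator columns:", sorted(columns))
```

With four categories, this one "covariate" would contribute three predictors to the model, which is why the count of nine covariates is a lower bound unless some kind of shorthand coding was used.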