25 April 2018

Some instances of apparent duplicate publication by Dr. Robert J. Sternberg

Dr. Robert J. Sternberg is a past president of the American Psychological Association, currently at Cornell University, with a CV that is over 100 pages long [PDF] and, according to Google Scholar, almost 150,000 citations.

Recently, some people have been complaining that too many of those are self-citations, leading to a formal petition to the APS Publication Committee. But sometimes, it seems, Dr. Sternberg prefers to make productive use of his previous work in a more direct manner. I was recently contacted by Brendan O'Connor, a graduate student at the University of Leicester, who had noticed that some of the text in Dr. Sternberg's many articles and chapters appeared to be almost identical. It seems that he may be on to something.

Exhibit 1


Brendan—who clearly has a promising career ahead of him as a data thug, should he choose that line of work—noticed that this 2010 article by Dr. Sternberg was basically a mashup of this article of his from the same year and this book chapter of his from 2002. One of the very few meaningful differences in the chunks that were recycled between the two 2010 articles is that the term "school psychology" is used in the mashup article to replace "cognitive education" from the other; this may perhaps not be unrelated to the fact that the former was published in School Psychology International (SPI) and the latter in the Journal of Cognitive Education and Psychology (JCEP). If you want to see just how much of the SPI article was recycled from the other two sources, have a look at this. Yellow highlighted text is copied verbatim from the 2002 chapter, green from the JCEP article. You can see that about 95% of the text is in one or the other colour:
Curiously, despite Dr. Sternberg's considerable appetite for self-citation (there are 26 citations of his own chapters or articles, plus 1 of a chapter in a book that he edited, in the JCEP article; 25 plus 5 in the SPI article), neither of the 2010 articles cites the other, even as "in press" or "manuscript under review"; nor does either of them cite the 2002 book chapter. If previously published work is so good that you want to copy big chunks from it, why would you not also cite it?

Exhibit 2


Inspired by Brendan's discovery, I decided to see if I could find any more examples. I downloaded Dr. Sternberg's CV and selected a couple of articles at random, then spent a few minutes googling some sentences that looked like the kind of generic observations that an author in search of making "efficient" use of his time might want to re-use.  On about the third attempt, after less than ten minutes of looking, I found a pair of articles, from 2003 and 2004, by Dr. Sternberg and Dr. Elena Grigorenko, with considerable overlaps in their text. About 60% of the text in the later article (which is about the general school student population) has been recycled from the earlier one (which is about gifted children), as you can see here (2003 on the left, 2004 on the right). The little blue paragraph in the 2004 article has also come from another source; see exhibit 4.
Neither of these articles cites the other, even as "in press" or "manuscript in preparation".

Exhibit 3


I wondered whether some of the text that was shared between the above pair of articles might have been used in other publications as well. It didn't take long(*) to find Dr. Sternberg's contribution (chapter 6) to this 2012 book, in which the vast majority of the text (around 85%, I estimate) has been assembled almost entirely from previous publications: chapter 11 of this 1990 book by Dr. Sternberg (blue), this 1998 chapter by Dr. Janet Davidson and Dr. Sternberg (green), the above-mentioned 2003 article by Dr. Sternberg and Dr. Grigorenko (yellow), and chapter 10 of this 2010 book by Dr. Sternberg, Dr. Linda Jarvin, and Dr. Grigorenko (pink).

Once again, despite the fact that this chapter cites 59 of Dr. Sternberg's own publications and another 10 chapters by other people in books that he (co-)edited, none of those citations are to the four works that were the source of all the highlighted text in the above illustration.

Now, sometimes one finds book chapters that are based on previous work. In such cases, it is the usual practice to include a note to that effect. And indeed, two chapters (numbered 26 and 27) in that 2012 book edited by Dr. Dawn Flanagan and Dr. Patti Harrison, contain an acknowledgement along the lines of "This chapter is adapted from <reference>.  Copyright 20xx by <publisher>.  Adapted by permission". But there is no such disclosure in chapter 6.

Exhibit 4


It appears that Dr. Sternberg has assembled a chapter almost entirely from previous work on more than one occasion. Here's a recent example of a chapter made principally from his earlier publications. About 80% of the words have been recycled from chapter 9 of this 2011 book by Dr. Sternberg, Dr. Jarvin, and Dr. Grigorenko (yellow), chapter 2 of this 2003 book by Dr. Sternberg (blue; this is also the source of the blue paragraph in Exhibit 2), chapter 1 of this 2002 book by Drs Sternberg and Grigorenko (green), the 2012 chapter(**) mentioned in Exhibit 3 above (pink), and a wafer-thin slice from chapter 2 (contributed by Dr. Sternberg) of this 2008 book (purple).

This chapter cites 50 of Dr. Sternberg's own publications and another 7 chapters by others in books that he (co-)edited. This time, one of the citations was for one of the five books that were the basis of the highlighted text in the above illustration, namely the 2003 book Wisdom, Intelligence, and Creativity Synthesized that was the source of the blue text. However, none of the citations of that book indicate that any of the text taken from it is being correctly quoted, with quote marks (or appropriate indentation) and a page number. The four other books from which the highlighted text was taken were not cited. No disclosure that this chapter has been adapted from previously published material appears in the chapter, or anywhere else in the 2017 book (or, indeed, in the first edition of the book from 2005, where a similar chapter by Dr. Sternberg was published).

Why this might be a problem (other than for the obvious reasons)


There are a lot of reasons why this sort of thing is not great for science, and I suspect that there will be quite a lot of discussion about the meta-scientific, moral, and perhaps even legal aspects (I seem to recall that when I publish something, I generally have to sign my copyright over to someone, which means I can't go round distributing it as I like, and I certainly can't sign the copyright of the same text over to a second publisher). But I also want to make a point about how, even if the copying process itself does no apparent direct harm, this practice can damage the process of scientific inquiry.

During a number of the copy-and-paste operations that were apparently performed, a few words were sometimes changed. In some cases this was merely cosmetic (e.g., "participants" being changed to "students"), or a reflection of changing norms over time. But in other cases it seemed that the paragraphs being copied were merely being repurposed to describe a different construct that, while perhaps being in some ways analogous to the previous one, was not the same.  For example, the 2017 chapter that is the subject of Exhibit 4 above contains this sentence:

"In each case, important kinds of developing competencies for life were not adequately reflected by the kinds of competencies measured by the conventional ability tests" (p. 12).

But if we go to yet another chapter by Dr. Sternberg, this time from 2002, that contains mostly the same text (tracing all of the places in which a particular set of paragraphs have been recycled turns out to be computationally intensive for the human brain), we find:

"In each case, important kinds of developing expertise for life were not adequately reflected by the kinds of expertise measured by the conventional ability tests" (p. 21).

Are we sure that "competencies" are the same thing as "expertise"? How about "school psychology" and "cognitive education", as in the titles of the articles in Exhibit 1? Are these concepts really so similar that one can recycle, verbatim, hundreds of words at a time about one of them and be sure that all of those words, and the empirical observations that they sometimes describe, are equally applicable to both? And if so, why bother to have the two concepts at all?

Relatedly, the single biggest source of words for Exhibit 3, published in 2012, was a chapter published in 1990. Can it really be the case that so little has been discovered in 22 years in research into the nature of intelligence that this material doesn't even merit rewriting from a retrospective viewpoint?

What next?


I'm not sure, frankly. But James Heathers has some thoughts here.




(*) Brendan and I are looking for other similar examples to the ones described in this post. Given how easy it was to find these ones, we suspect that there may be more to be uncovered.

(**) While searching, I lost track of the number of times that the descriptions of the Rainbow and Kaleidoscope projects have been recycled across multiple publications. Citing the copy from the 2012 article seemed like an appropriate way to convey the continuity of the problem. For some reason, though, in this version from 2005, the number of students included in the sample was 777, instead of the 793 reported everywhere else.


14 comments:

  1. Impressive. It's amazing how careers can benefit from text recycling.
    Text recycling puts a different perspective on the phrase "cumulative evidence". The practice is much more widely spread in some research fields than in others. Psychology and especially economics seem to have high levels of text recycling, especially among 'productive' authors.
    (See our paper on the extent of recycling in Horbach & Halffman in Research Policy, https://doi.org/10.1016/j.respol.2017.09.004)

  2. Off topic: have you looked into the PACE trial, a study published in The Lancet featuring outcome switching and other flaws, which has not been retracted but should be? Do you have any thoughts on it?

  3. Did you use software to do this? If so, what was the name of the program?

    Replies
    1. I didn't use any software (other than Google to look for the duplicate text and a PDF print driver to make the multi-page images). I don't think Brendan did for Exhibit 1, either (or he would hopefully have told me!).

      There are online tools that claim to do this, but I don't know how good they are. I tried one (draftable.com) on a couple of articles by Sternberg that had one almost-duplicated paragraph, and it didn't spot it, so presumably the slight differences were enough to throw it off the scent. Google is quite nice for this as it doesn't insist on perfect matches if you don't put the search string in quotes.

    2. I had a student program this little tool on the basis of Dick Grune's sim_text: https://people.f4.htw-berlin.de/~weberwu/simtexter/app.html You put one text on each side and compare them; the identical text is colored automatically, and clicking on one bit brings up the parallel in the other text. Of course, you still have to find the sources, but it is a big help in documenting text parallels.

    3. Apparently that tool can't read PDF files... :-(

  4. Very interesting! I wonder if "edit distance" functions (i.e., Levenshtein distance) in R could be useful. Essentially, the method calculates how many letters would have to be switched/deleted/replaced to get a match. You could probably use the R functions with words instead of letters, without too much hassle, and possibly automate the process.
    https://cran.r-project.org/web/packages/stringdist/stringdist.pdf

    A super fancy version could also consider synonyms as identical, but that probably isn't necessary based on how successful you've been doing this "by hand".

  5. In Germany, a few years back, software was developed and used in a series of investigations into (mostly) politicians' dissertations. See http://de.guttenplag.wikia.com/wiki/Tools,_um_Plagiate_aufzuspüren?li_source=LI&li_medium=wikia-footer-wiki-rec for a list of services and tools. Discussion at http://de.vroniplag.wikia.com/
    Best wishes, Ulf

    Replies
    1. Sorry, I was part of that effort (and the documentation of currently 199 other dissertations and habilitations at http://de.vroniplag.wikia.com/wiki/Home). Neither GuttenPlag Wiki nor VroniPlag Wiki is software! This was done by hand by researchers who could google and compare texts. There are tools available to document aspects of plagiarism, but finding it is very difficult, as there is an open world of potential sources.

  6. Great work, by James Heathers too. I have written about more extensive problems in Sternberg's work, some of it involving his citation practices. It took months to disentangle what was going on in the matter I investigated, which is why no one had attempted it before, and why so many were grateful to see it in print. Here are two articles. See also the article by Nathan Brody published together with the first one below.
    Gottfredson, L. S. (2003). Dissecting practical intelligence theory: Its claims and evidence. Intelligence, 31(4), 343-397.
    Gottfredson, L. S. (2003). On Sternberg's "Reply to Gottfredson." Intelligence, 31(4), 415-424.
    To download copies, go to http://www1.udel.edu/educ/gottfredson/reprints/#Publications

    For a short version, see this book review, also on my website.
    Gottfredson, L. S. (2001). Review of Practical Intelligence in Everyday Life by R. J. Sternberg et al. Intelligence, 29, 363-365.

  7. I made a very mild comment on Sternberg's editorship of The Cambridge Handbook of Intelligence 2011, where I felt that self-citation had distorted the balanced depiction of psychometric research. http://www.unz.com/jthompson/a-handbook-of-intelligence/

  8. The book by Professor Alice Sterling Honig, 'Literacy Story-telling and Bilingualism in Asian Classrooms' (2018), on Amazon contains a collection of 10 published articles from the journal 'Early Child Development and Care' (paywalled). The book is published by Routledge and the journal was published by Taylor and Francis.
    https://www.amazon.com/Literacy-Storytelling-Bilingualism-Asian-Classrooms/dp/1138502626

    Does this fall under text recycling?

    Chapter 7 is contributed by Noel Chia Kok Hwee and Norman Kee Kiak Nam: 'Gender Differences in the reading process of six-year-olds' 2013.

    This article has been retracted, due to the non-availability of the data;
    retractiondatabase.org/RetractionSearch.aspx#?auth%3dChia%252c%2bNoel%2bKok%2bHwee
    Noel Chia Kok Hwee resigned in March 2016 and more than 20 of his articles and 20 books have been retracted due to scientific misconduct.
    In Singapore, the book is available in the National Library (2016):
    www.nlb.gov.sg/biblio/202480138

  9. Thank you, Nick, for taking the time to put this together.
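
The word-level edit distance suggested in comment 4 can be sketched in a few lines. This is only an illustration in Python (the comment itself mentions R and the stringdist package), and it is not a tool that was used for this post, where the comparisons were done by hand. The function names `tokenize`, `word_edit_distance`, and `similarity` are hypothetical, the tokenizer is deliberately crude, and the two sample sentences are the ones quoted in the post from the 2017 and 2002 chapters.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_edit_distance(a, b):
    """Levenshtein distance between two token lists.

    Inserting, deleting, or substituting one word each costs 1.
    Only the previous row of the table is kept to save memory.
    """
    previous = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        current = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            current.append(min(previous[j] + 1,          # delete word_a
                               current[j - 1] + 1,       # insert word_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity(text_a, text_b):
    """Return 1.0 for identical token sequences, falling towards 0.0."""
    a, b = tokenize(text_a), tokenize(text_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / longest


sentence_2017 = ("In each case, important kinds of developing competencies "
                 "for life were not adequately reflected by the kinds of "
                 "competencies measured by the conventional ability tests")
sentence_2002 = ("In each case, important kinds of developing expertise "
                 "for life were not adequately reflected by the kinds of "
                 "expertise measured by the conventional ability tests")

distance = word_edit_distance(tokenize(sentence_2017), tokenize(sentence_2002))
print(distance, round(similarity(sentence_2017, sentence_2002), 2))
```

For these two 25-word sentences the distance is 2 (the two swaps of "competencies" for "expertise"), giving a similarity of 0.92. A score like this only helps to document a suspected overlap between two texts that are already in hand; as noted in the replies to comments 3 and 5, finding the candidate sources in the first place is the hard part.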
