Part 1 of an occasional series on the Empirical Retraction Lit bibliography
In 2020, my team released the first version of the Empirical Retraction Lit bibliography, and we have updated it a number of times since: the most recent updates were in July 2021 (content), September 2021 (taxonomy), and December 2022 (JSON/web design giving access to the taxonomy).
The bibliography is part of my Alfred P. Sloan-funded project, Reducing the Inadvertent Spread of Retracted Science, and it has also been an avenue for me to experiment with literature review automation and bibliography web tools. Since August, members of my lab have been writing up a review on post-retraction citation, building on the work a number of people have contributed to the review over the past several years. To update the content, we’re also working on a systematic search and screening process.
I expect a substantial number of new items. In July 2021 we had 385 items. Since then I’d been estimating perhaps 7 new papers a month, which would mean ~175 new items from July 2021 to August 2023 (our systematic search was run on September 5, 2023). That ballpark figure seems plausible now that I’m in the middle of full-text screening. Two-plus years is a very long time in retraction research, especially given the attention retraction has received in the past few years!
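The back-of-the-envelope estimate above can be checked with a quick calculation (the monthly rate is the rough guess stated in the text, not a measured figure):

```python
# Sanity check of the estimate: ~7 new papers/month over the
# roughly 25 months between July 2021 and August 2023.
months = 25          # Aug 2021 through Aug 2023, inclusive
rate = 7             # rough guess: new in-scope papers per month
baseline = 385       # items in the July 2021 version

new_items = months * rate
print(new_items)             # 175 new items expected
print(baseline + new_items)  # ~560 items after the update
```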
A number of questions arise in trying to optimize a scoping review update process. Here are just a few:
- Is the search I used last time adequate? Should it be updated or changed in any way?
- Is date-based truncation appropriate for my search?
- Is it appropriate to exclude certain items from the initial search (e.g., data, preprints)?
- Is there a high-precision way to exclude retraction notices and retracted publications when the database indexing is insufficient?
- Could citation-based searching in one or several sources replace multi-database searching on this topic? What are its precision and recall?
- Are there additional databases that should be added to the search?
- Is additional author-based searching relevant?
- What is the most painless and effective way to deduplicate items? (Complicated in my case by retraction notices; retracted publications; and non-English language items that have multiple translations.)
- Which items without abstracts may be relevant to this topic?
- What is the shortest item that can make a research contribution on this topic?
- Is original, “empirical” research a clear and appropriate scope?
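To make the deduplication question above concrete, here is a minimal sketch of one common approach, matching on normalized titles. Everything here is illustrative (the record fields and the `is_notice` flag are hypothetical); real deduplication would need fuzzier matching, and translated items with distinct titles would still require manual review:

```python
import re
import unicodedata

def normalize_title(title):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    t = unicodedata.normalize("NFKD", title)
    t = "".join(c for c in t if not unicodedata.combining(c))
    t = re.sub(r"[^a-z0-9 ]", " ", t.lower())
    return " ".join(t.split())

def deduplicate(records):
    """Keep the first record per (normalized title, year, notice?) key.

    Retraction notices can share a title with the retracted paper,
    so a hypothetical is_notice flag is part of the key to avoid
    collapsing a notice into the publication it retracts.
    """
    seen = {}
    for rec in records:
        key = (normalize_title(rec["title"]), rec.get("year"),
               rec.get("is_notice", False))
        seen.setdefault(key, rec)
    return list(seen.values())

records = [
    {"title": "Post-Retraction Citation: A Study", "year": 2022},
    {"title": "post retraction citation   a study!", "year": 2022},
    {"title": "Retraction: Post-Retraction Citation: A Study",
     "year": 2022, "is_notice": True},
]
deduped = deduplicate(records)  # collapses the first two, keeps the notice
```

The weakness of any exact-key approach is visible immediately: it handles punctuation and case variants, but not the translation and notice-title edge cases mentioned above.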
Ideally, the Empirical Retraction Lit bibliography will become a living systematic review that relies on as much automation and as little human effort as is appropriate, with an eye towards monthly updates. My experimentation with existing automation tools makes this plausible. Routinely screening a few papers a month seems feasible as well, especially since I could repurpose the time I currently spend on ad-hoc tracking of the literature, which has missed a number of items compared to systematic searching (even some high-profile items in English!).
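One piece of such a monthly pipeline could be a date-bounded query against a public index. As a hedged sketch only: the Crossref REST API supports filtering works by index date, though the search terms below are placeholders and not the project’s actual search strategy:

```python
from urllib.parse import urlencode

def build_crossref_url(since_date, rows=100):
    """Build a Crossref /works query for items indexed since a given date.

    The endpoint and parameter names (query.bibliographic, filter,
    rows, sort, order) are real Crossref API parameters; the query
    terms are placeholders for illustration.
    """
    params = {
        "query.bibliographic": "retraction retracted",
        "filter": f"from-index-date:{since_date}",
        "rows": rows,
        "sort": "indexed",
        "order": "desc",
    }
    return "https://api.crossref.org/works?" + urlencode(params)

url = build_crossref_url("2023-08-01")
# The JSON response from urllib.request.urlopen(url) would then feed
# the screening queue for the monthly update.
```

A monthly cron job running a query like this would not replace multi-database searching, but it could catch the high-profile items that ad-hoc tracking misses.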
Automation has also become more palatable now that I’ve found errors in the laborious human-led review: at least two older, in-scope items that were not included in the July 2021 version, presumably because they were poorly indexed at the time of the previous searches; and an informally published item that appears to have been erroneously excluded, presumably due to confusion about our rule of excluding data and preprints only when the bibliography already included a related item.
Of course, for the website, there are a number of open questions:
- How can the bibliography website be made useful for authors, reviewers, and editors? Awareness of related publications becomes even more important because the retraction literature is very scattered and diffuse!
- How can the complex taxonomy of topics be made clear?
- Would “suggest a publication” be a useful addition?
My aims in writing are:
- To share my experience about the tools and processes I’m using.
- To document the errors I make and the problems I have. This:
- will remind me in the future (so I make different mistakes or try different tools).
- can inform tool development.
- can inspire systematic analysis of pragmatic information retrieval.
- To identify possible collaborators for the Empirical Retraction Lit bibliography scoping review and website; for finalizing the post-retraction citation review; and for writing other reviews from the scoping review.
- To solicit feedback from the various communities working on pragmatic information retrieval, systematic review automation, retraction, library technology, scholarly publishing metadata workflows, literature review methodology, publication ethics, science of science, and more.