
Review of "Massive covidization of research citations and the citation elite"

Published on Mar 27, 2022

This paper shows how citations given in 2021 and 2022 are strongly skewed toward COVID-19 research, a phenomenon that the authors refer to as “covidization” of citations. Based on the statistics presented in the paper, this terminology does indeed seem justified. The authors for instance show that 20% of the citations received by publications from 2021 and 2022 were received by COVID-19 publications, even though these publications represent only 4% of the total number of publications in 2021 and 2022.

I found the paper interesting to read. My comments are provided below.

Preprints played an important role in the pandemic. However, it is not clear how they were handled by the authors in their analyses. On the one hand, the authors state that they “included preprint publications from ArXiv, SSRN, BioRxiv, ChemRxiv and medRxiv”. On the other hand, “citations from or to preprints were not included in any of the counts”. This is confusing. Why include preprints if their citations are excluded, while the entire paper is about citations? The authors need to clarify this issue and they need to motivate the approach they took for handling preprints.

“As shown, the number of authors who received >=100 citations to very recent work was almost double for COVID-19 work than for non-COVID-19 work”: The results in Table 2 seem to show it is the other way around.

“Almost all of the scientists who had received >10,000 citations to their very recent work were from China.”: Author name disambiguation is relatively difficult for Chinese authors. Could this observation be the result of disambiguation mistakes?

“Five scientists improved their ranking more than 6-fold”: Could you give more information about these five scientists? It would be helpful to know a bit more about their area of expertise and about the contribution they made to COVID-19 research. I also wonder whether these five scientists all worked independently from each other, or whether some of them were collaborators.

“who went from rank 48045 in 2019 to rank 362 in 2020, a 13-fold improvement”: Why is this a 13-fold improvement? I would interpret this as a 48045 / 362 ≈ 133-fold improvement.
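To make the arithmetic explicit, here is a minimal sketch. It assumes that "fold improvement" means the ratio of the earlier rank to the later rank, which is my reading and not necessarily the definition the authors used; the function name `fold_improvement` is hypothetical.

```python
def fold_improvement(rank_before: int, rank_after: int) -> float:
    """Ratio of the earlier rank to the later rank (assumed definition)."""
    if rank_before <= 0 or rank_after <= 0:
        raise ValueError("ranks must be positive")
    return rank_before / rank_after


# Ranks quoted in the manuscript: 48045 in 2019, 362 in 2020.
fold = fold_improvement(48045, 362)
print(f"{fold:.1f}-fold improvement")  # prints "132.7-fold improvement"

# Under the same definition, the 2020 rank that a 13-fold improvement would imply.
implied_rank_for_13_fold = 48045 / 13
print(round(implied_rank_for_13_fold))  # prints 3696
```

Under this definition the reported ranks are off from a 13-fold change by roughly a factor of ten, so either the ranks or the fold value would need to be corrected.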

“Overall, 143 of the top-200 ranked scientists across science”: In the preceding paragraph, the focus was on the top 300 scientists. Why not consistently focus on either the top 200 or the top 300?

The section “improved overall citation ranking for scientists with influential COVID-19 work” shows that the citation impact of some researchers increased strongly as a result of their COVID-19 work. This is not very surprising. I would find it more interesting to see an analysis of whether the pandemic led to more substantial changes in the elite of top-cited researchers than the changes observed in normal, non-pandemic years. For instance, suppose we consider the elite of the top N (e.g., N = 100, 1000, or 10000) researchers with the highest citation impact in the periods 2016-2017, 2018-2019, and 2020-2021, and we determine the overlap of the elites in the first and second periods and the overlap of the elites in the second and third periods. I would like to see these two overlaps compared. If the latter overlap is smaller than the former, this provides evidence that the pandemic led to abnormal changes in the elite of top-cited researchers.
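The overlap comparison suggested above could be sketched roughly as follows. This is only an illustration: the function names, the dictionary layout (researcher identifier mapped to a citation impact score), and the toy scores are all hypothetical, and the authors may prefer a different overlap measure.

```python
def top_n(citation_scores: dict[str, float], n: int) -> set[str]:
    """Return the n researchers with the highest score (ties broken by identifier)."""
    ranked = sorted(citation_scores, key=lambda r: (-citation_scores[r], r))
    return set(ranked[:n])


def elite_overlap(scores_a: dict[str, float], scores_b: dict[str, float], n: int) -> float:
    """Fraction of the top-n elite of period A that is also in the top-n elite of period B."""
    return len(top_n(scores_a, n) & top_n(scores_b, n)) / n


# Toy, made-up scores for five hypothetical researchers in three periods.
period_2016_2017 = {"A": 90, "B": 80, "C": 70, "D": 10, "E": 5}
period_2018_2019 = {"A": 95, "B": 85, "C": 20, "D": 75, "E": 5}
period_2020_2021 = {"A": 10, "B": 12, "C": 15, "D": 60, "E": 99}

n = 3
overlap_pre = elite_overlap(period_2016_2017, period_2018_2019, n)       # normal years
overlap_pandemic = elite_overlap(period_2018_2019, period_2020_2021, n)  # pandemic years
print(f"overlap, periods 1 and 2: {overlap_pre:.2f}")
print(f"overlap, periods 2 and 3: {overlap_pandemic:.2f}")
```

With real data, a clearly smaller second overlap across several values of N would support the claim of abnormal turnover in the elite; similar overlaps would suggest that the changes are in line with normal years.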

I am not sure what conclusions can be drawn from the correlation coefficients reported in the section “Correlation between metrics of impact: career impact and 2020-2021 work”. Summarizing a complex relationship between two variables (i.e., two metrics) in a single correlation coefficient offers limited information. In addition, the Pearson correlation has the problem that it may be strongly influenced by a few extreme values. The plots presented in Figure 3 are also of limited value, since there are too many data points in these plots, making it difficult to see clear patterns. The authors should try to perform the analysis presented in this section in a more insightful way. For instance, they could bin researchers based on one metric and then present a plot that shows for each bin the mean/median value (as well as some measure of dispersion) for the other metric.
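As an illustration of the binning approach suggested above, the following sketch groups researchers into equal-sized bins by one metric and reports the median and interquartile range of the other metric per bin. The function name, the choice of equal-sized bins, and the toy values are my own assumptions; other binning schemes (e.g., logarithmic bins) may suit citation data better.

```python
from statistics import median, quantiles


def binned_summary(x: list[float], y: list[float], n_bins: int) -> list[dict]:
    """Sort researchers by metric x, split them into n_bins equal-sized bins,
    and summarize metric y in each bin by its median and quartiles."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    pairs = sorted(zip(x, y))
    size = len(pairs)
    summary = []
    for b in range(n_bins):
        chunk = pairs[b * size // n_bins:(b + 1) * size // n_bins]
        if not chunk:
            continue  # more bins than data points
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        if len(ys) >= 2:
            q1, _, q3 = quantiles(ys, n=4, method="inclusive")
        else:
            q1 = q3 = ys[0]  # a single point has no spread
        summary.append({
            "x_min": xs[0], "x_max": xs[-1], "count": len(ys),
            "y_median": median(ys), "y_q1": q1, "y_q3": q3,
        })
    return summary


# Toy, made-up values: x = career metric, y = metric for 2020-2021 work.
career = [1, 2, 3, 4, 5, 6, 7, 8]
recent = [10, 20, 30, 40, 50, 60, 70, 80]
bins = binned_summary(career, recent, n_bins=2)
for row in bins:
    print(row)
```

Plotting `y_median` per bin with the quartiles as error bars would show how the relationship changes across the range of the first metric, and medians are much less sensitive to a few extreme values than the Pearson correlation.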

Regarding data sharing, the authors state that “all the key data are in the manuscript”. I don’t agree. To develop an in-depth understanding of changes in the elite of top-cited researchers, it is important to have access to data at the level of individual researchers. I therefore would like to ask the authors to make available the researcher-level data behind their analyses (including the names of researchers).
