
Review of "Journal impact factor and peer review thoroughness and helpfulness: A supervised machine learning study"

Published on Sep 25, 2022

This Pub is a Review of
Journal Impact Factor and Peer Review Thoroughness and Helpfulness: A Supervised Machine Learning Study
Description

The journal impact factor (JIF) is often equated with journal quality and the quality of the peer review of the papers submitted to the journal. We examined the association between the content of peer review and JIF by analysing 10,000 peer review reports submitted to 1,644 medical and life sciences journals. Two researchers hand-coded a random sample of 2,000 sentences. We then trained machine learning models to classify all 187,240 sentences as contributing or not contributing to content categories. We examined the association between ten groups of journals defined by JIF deciles and the content of peer reviews using linear mixed-effects models, adjusting for the length of the review. The JIF ranged from 0.21 to 74.70. The length of peer reviews increased from the lowest JIF group (median number of words 185) to the highest JIF group (387 words). The proportion of sentences allocated to different content categories varied widely, even within JIF groups. For thoroughness, sentences on 'Materials and Methods' were more common in the highest JIF journals than in the lowest JIF group (difference of 7.8 percentage points; 95% CI 4.9 to 10.7%). The trend for 'Presentation and Reporting' went in the opposite direction, with the highest JIF journals giving less emphasis to such content (difference -8.9%; 95% CI -11.3 to -6.5%). For helpfulness, reviews for higher JIF journals devoted less attention to 'Suggestion and Solution' and provided fewer 'Examples' than lower impact factor journals. No or only small differences were evident for other content categories. In conclusion, peer review in journals with higher JIF tends to be more thorough in discussing the methods used but less helpful in terms of suggesting solutions and providing examples. Differences were modest and variability high, indicating that the JIF is a poor predictor of the quality of peer review of an individual manuscript.

As a signatory of Publish Your Reviews, I have committed to publish my peer reviews alongside the preprint version of an article. For more information, see http://publishyourreviews.org.

This paper presents a large-scale analysis of the content of peer review reports, focusing on the different types of comments provided in reviews and their association with journal impact factors. The scale of the analysis is impressive; studies of the content of such a large number of review reports are rare. I enjoyed reading the paper, even though I did not find the results to be particularly surprising.

 

Feedback and suggestions for improvements are provided below.

 

The methods used by the authors would benefit from a significantly more detailed explanation:

“Scholars can submit their reviews for other journals by either forwarding the review confirmation emails from the journals to Publons or by sending a screenshot of the review from the peer review submission system.”: This sentence is unclear. Review confirmation emails often do not include the review itself, only a brief ‘thank you’ message, so it is not clear to me how a review can be obtained from such a confirmation email. I also do not understand how a review can be obtained from a screenshot. A screenshot may show only part of the review, not the entire review, and there would be a significant technical challenge in converting the screenshot, which is an image, to machine-readable text.

I would like to know whether all reviews are in English or whether there are also reviews in other languages.

Impact factors change over time, with new values calculated each year. The authors need to explain which year's impact factors they used.

There are many journals that do not have an impact factor. The authors need to explain how these journals were handled.

The authors also need to discuss how reviewers were linked to publication profiles. This is a non-trivial step that is needed to determine the number of publications of a reviewer and the years of their first and last publications. The authors do not explain how this step was carried out in their analysis, and it is important to provide this information.

“We used a Naïve Bayes algorithm to train the classifier and predict the absence or presence of the eight characteristics in each sentence of the peer review report.”: The machine learning approach used by the authors is explained in just one sentence. A more elaborate explanation is needed. There are many machine learning approaches; the authors need to explain why they chose Naïve Bayes, and they also need to briefly discuss how Naïve Bayes performs the classification task.
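To illustrate the level of detail I have in mind, below is a minimal sketch of what such a sentence-level classifier could look like. The paper does not specify the feature representation, library, or preprocessing, so the bag-of-words features and the use of scikit-learn's MultinomialNB are my own assumptions, and the training sentences are purely hypothetical.

```python
# Minimal sketch of a Naive Bayes sentence classifier (assumed setup;
# the paper does not specify features, library, or preprocessing).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical hand-coded training data: sentences and binary labels for
# one content category (e.g. 'Materials and Methods': 1 = present, 0 = absent).
train_sentences = [
    "The sample size calculation is not reported.",
    "The manuscript is well written.",
]
train_labels = [1, 0]

# Bag-of-words features followed by a multinomial Naive Bayes classifier;
# one such binary classifier would be trained per content category.
classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(train_sentences, train_labels)

# Predict presence/absence of the category for unseen review sentences.
print(classifier.predict(["Please clarify how participants were recruited."]))
```

Explaining choices of this kind (features, handling of class imbalance, per-category versus multi-label training) would make the classification step much easier to assess.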

Likewise, I would like to see a proper discussion of the statistical model used by the authors. The authors explain their statistical approach only informally; I would find it helpful to see a more formal description, in mathematical notation, of the model.
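To give an impression of the kind of description I have in mind, one possible formalisation, purely my own sketch assuming a journal-level random intercept and a fixed effect for review length (the authors' actual specification may differ), would be:

```latex
% Sketch of a linear mixed-effects model for one content category (assumed specification)
y_{ij} = \beta_0 + \sum_{d=2}^{10} \beta_d \, D_{dj} + \gamma \, \mathrm{length}_{ij} + u_j + \varepsilon_{ij},
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

where y_ij is the proportion of sentences in review i for journal j allocated to a given content category, D_dj indicates whether journal j belongs to JIF decile group d, length_ij is the length of the review, and u_j is a journal-level random intercept.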

 

“Most distributions were skewed right, with a peak at 0% showing the number of reviews that did not address the content category (Fig 1).”: I do not understand how the peaks at 0% can be explained. Could this be due to problems in the data (e.g., missing or empty review reports)? The authors need to explain this.

 

“the prevalence of content related to thoroughness and helpfulness varied widely even between journals with similar journal impact factor”: I am not sure whether the word ‘between’ is correct in this sentence. My understanding is that the authors did not distinguish between variation between journals and variation within journals.
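If the linear mixed-effects model does include a journal-level random intercept, as in the sketch above, the between-journal and within-journal components of this variability could in principle be quantified separately, for example via the intraclass correlation (my own suggestion, not something the authors report):

```latex
% Intraclass correlation under a random-intercept model (suggested, not reported by the authors)
\mathrm{ICC} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}
```

where sigma_u^2 is the between-journal variance and sigma_epsilon^2 the within-journal (residual) variance. Reporting such a decomposition would make clear whether 'between' is the right word here.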

 

“Some journals now publish peer reviews and authors' responses with the articles”: Consider citing the following paper: https://doi.org/10.1007/s11192-020-03488-4. I also recently published a blog post on this topic: https://www.leidenmadtrics.nl/articles/the-growth-of-open-peer-review.

 

“Bibliographic databases have also started to publish reviews.”: In addition to Web of Science, I think the work done by Europe PMC needs to be acknowledged as well. See for instance this poster presented at the recent OASPA conference: https://oaspa.org/wp-content/uploads/2022/09/Melissa-Harrison_COASP-2022-poster_V2.pdf.

 

“peer review in journals with higher impact factors tends to be more thorough in addressing study methods but less helpful in suggesting solutions or providing examples”: I wonder whether this conclusion is justified. Relatively speaking, sentences in reviews for higher impact factor journals are indeed more likely to address methods and less likely to suggest solutions or to provide examples. However, as shown by the authors, reviews for higher impact factor journals tend to be substantially longer than reviews for lower impact factor journals. Therefore, it seems that the total number of sentences (as opposed to the proportion of sentences) suggesting solutions or providing examples may be higher in reviews for higher impact factor journals than in reviews for lower impact factor journals. If that is indeed the case, it seems to me the conclusion should be that peer review in higher impact factor journals is both more thorough and more helpful.
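A simple back-of-the-envelope calculation, using purely illustrative numbers of my own (not figures from the paper), shows how the relative and absolute pictures can diverge:

```python
# Illustrative (made-up) numbers: a lower proportion of 'Suggestion and
# Solution' sentences can still mean more such sentences in absolute terms
# when reviews are longer.
low_jif_sentences, low_jif_share = 10, 0.20    # hypothetical short review, 20% suggestions
high_jif_sentences, high_jif_share = 25, 0.12  # hypothetical long review, 12% suggestions

print(low_jif_sentences * low_jif_share)    # 2.0 suggestion sentences
print(high_jif_sentences * high_jif_share)  # 3.0 suggestion sentences, despite the lower share
```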

 

Finally, I think it needs to be acknowledged that the quality assurance processes of journals consist not only of the work done by peer reviewers but also of the work done by the editorial staff of journals. This seems important in particular for more prestigious journals, which presumably make more significant investments in editorial quality assurance processes. The results presented in the paper offer valuable insights into peer review processes, but they provide only a partial picture of the overall quality assurance processes of journals.

 

Competing interests:

  • I am Editor-in-Chief of Quantitative Science Studies. Reviews of articles published in Quantitative Science Studies are made openly available in Web of Science (formerly Publons).

  • I am collaborating with Clarivate. Two authors of the paper under review are affiliated with Clarivate.

  • In the Research on Research Institute (RoRI), I am collaborating with the Swiss National Science Foundation (SNSF). Two authors of the paper under review are affiliated with SNSF.
