Review of "A survey of how biology researchers assess credibility when serving on grant and hiring committees"

Published on Jun 01, 2024
This Pub is a Review of
A survey of how biology researchers assess credibility when serving on grant and hiring committees

Researchers who serve on grant review and hiring committees have to make decisions about the intrinsic value of research in short periods of time, and research impact metrics such as Journal Impact Factor (JIF) exert undue influence on these decisions. Initiatives such as the Coalition for Advancing Research Assessment (CoARA) and the Declaration on Research Assessment (DORA) emphasize responsible use of quantitative metrics and avoidance of journal-based impact metrics for research assessment. Further, our previous qualitative research suggested that assessing credibility, or trustworthiness, of research is important to researchers not only when they seek to inform their own research but also in the context of research assessment committees. To confirm our findings from previous interviews in quantitative terms, we surveyed 485 biology researchers who have served on committees for grant review or hiring and promotion decisions, to understand how they assess the credibility of research outputs in these contexts. We found that concepts like credibility, trustworthiness, quality and impact lack consistent definitions and interpretations by researchers, which had already been observed in our interviews. We also found that assessment of credibility is very important to most (81%) of researchers serving in these committees but fewer than half of respondents are satisfied with their ability to assess credibility. A substantial proportion of respondents (57% of respondents) report using journal reputation and JIF to assess credibility – proxies that research assessment reformers consider inappropriate to assess credibility because they don’t rely on intrinsic characteristics of the research.
This gap between importance of an assessment and satisfaction in the ability to conduct it was reflected in multiple aspects of credibility we tested and it was greatest for researchers seeking to assess the integrity of research (such as identifying signs of fabrication, falsification, or plagiarism), and the suitability and completeness of research methods. Non-traditional research outputs associated with Open Science practices – research data, code, protocol and preprints sharing – are particularly hard for researchers to assess, despite the potential of Open Science practices to signal trustworthiness. Our results suggest opportunities to develop better guidance and better signals to support the evaluation of research credibility and trustworthiness – and ultimately support research assessment reform, away from the use of inappropriate proxies for impact and towards assessing the intrinsic characteristics and values researchers see as important.

As a signatory of Publish Your Reviews, I have committed to publish my peer reviews alongside the preprint version of an article. For more information, see

This paper presents the results of a survey studying how biology researchers assess credibility when they serve on grant and hiring committees. The paper is well written and the research is well done. I enjoyed reading the paper. I have a few minor comments.

Table 1 distinguishes between appropriate and inappropriate proxies for assessing the credibility of research. I am not convinced by the way the authors make this distinction. I think there is a need for more nuance. (Unfortunately, the research assessment reform movement sometimes fails to be sufficiently nuanced in its criticism of traditional assessment practices.)

While overreliance on journal reputation causes lots of problems, this doesn’t mean journal reputation is always an inappropriate proxy for assessing credibility. If a journal is known to systematically perform rigorous peer review, the use of this information to assess the credibility of an article in the journal makes sense. I don’t consider this to be bad practice. Likewise, while overreliance on journal impact factors is a big problem in some assessment systems, it is not clear whether the use of journal impact factors should always be rejected (e.g., see the argument I presented in

Conversely, the authors consider ‘Confirm output is peer reviewed’ to be an appropriate signal, but I would argue this signal is actually less informative than the reputation of a journal, because reputation takes into account not only whether a journal performs peer review, but also the level of rigor of the peer review process. So if reputation is considered to be an inappropriate signal, then ‘Confirm output is peer reviewed’ should also be considered inappropriate.

In the data analysis section, the authors explain that “comparisons of segment response proportions were conducted using the expss package in R using a z-test with Bonferroni correction, with a significance level set at p<0.05”. It is not clear to me which comparisons the authors are referring to. I cannot find these comparisons in the results presented in the paper. In general, my advice would be to use confidence intervals instead of significance tests.
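As a minimal sketch of what reporting a confidence interval for a comparison of response proportions could look like, the following computes a Wald interval for the difference between two segments. The function name and the counts are hypothetical and are not taken from the paper; the Wald interval is only one of several possible choices, and it does not reproduce the authors' expss analysis.

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Wald confidence interval for the difference p1 - p2.

    x1, x2: number of respondents in each segment giving the response
    n1, n2: segment sizes
    """
    p1, p2 = x1 / n1, x2 / n2
    # Two-sided normal quantile, e.g. about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of the difference between two independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Hypothetical counts for two respondent segments (illustration only)
low, high = diff_proportion_ci(120, 200, 150, 285)
print(f"difference in proportions, 95% CI: [{low:.3f}, {high:.3f}]")
```

An interval of this kind shows both the size of the difference and its uncertainty, which a bare "p<0.05" does not.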

In the section ‘How researchers define credibility’, I would be interested to see a table showing all correlation coefficients. Also, I find the p-values reported by the authors unhelpful, since they test the hypothesis that the importance ratings are uncorrelated, and this hypothesis is rather unrealistic. If the authors wish to present inferential statistics, I would recommend reporting confidence intervals.
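To illustrate the recommendation, a confidence interval for a correlation coefficient can be obtained with the Fisher z-transformation. This is a minimal sketch under assumptions: the function name and the example coefficient are hypothetical, the sample size of 485 is used only because it is the number of survey respondents, and the approximation is designed for Pearson correlations, so it may be only a rough guide for ordinal importance ratings.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    z = atanh(r)             # Fisher transformation of r
    se = 1 / sqrt(n - 3)     # approximate standard error on the z scale
    q = NormalDist().inv_cdf(0.5 + level / 2)
    # Transform the interval bounds back to the correlation scale
    return tanh(z - q * se), tanh(z + q * se)


# Hypothetical coefficient; n = 485 assumes every respondent rated both items
low, high = correlation_ci(0.30, 485)
print(f"r = 0.30, 95% CI: [{low:.3f}, {high:.3f}]")
```

Reporting such intervals alongside a table of all coefficients would convey the strength of each association rather than only whether it differs from zero.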
