Independent report

Review of peer review: June 2023

From:
UKRI
Executive summary

This report presents the findings of a study commissioned by UK Research and Innovation (UKRI) to review interventions in the peer review processes used in research and innovation (R&I) award funding. It is intended as a resource for R&I funders across the globe looking to optimise and innovate in their award-making processes.

The study assessed 38 interventions, which range from small process ‘tweaks’ such as increasing or decreasing the number of reviewers per application and shortening application sections, to more fundamental changes such as partial randomisation and complete bypass of peer review.

The aim of the study was to assess these 38 interventions, to establish what each of them might be useful for, and what disadvantages or hazards each might entail. We also provide an assessment of the overall strength of evidence on each intervention, including which ones are well studied and which ones are not.

Our research is underpinned by an extensive literature review (encompassing both academic and ‘grey’ literature), a survey of UKRI staff, and a programme of 22 interviews with representatives of UK and international research funders and a range of other stakeholders and experts in the field.

We find that all the interventions we considered here are typically intended to fulfil at least one (or sometimes several) of the following seven aims:

  • to save time, including speeding up time-to-grant
  • to optimise the relevance of applications and funded awards to the aims of the funding scheme
  • to increase the ability to identify and fund high-risk or high-reward projects (sometimes known as ‘frontier’, ‘transformative’ or ‘breakthrough’ research)
  • to reduce burden (on applicants, reviewers, panellists or administrators)
  • to manage application volume (often a subset of reducing burden, but may also occur for other reasons)
  • to reduce bias and ensure greater inclusion of disadvantaged groups, including along lines of gender, career stage, institution, or any other category
  • to improve the overall quality of reviews (for instance, by ensuring optimally tailored expertise of reviewers or increased levels of transparency and feedback)

These seven aims correspond well to the known challenges of peer review documented in the ‘science of science’ literature. Almost all the 38 interventions considered in our review provide opportunities to fulfil the above aims.

At the same time, no intervention is a catch-all solution. None pertain to all seven aims, most are useful for certain contexts and less useful (or even problematic) in others, and almost all may entail some form of disadvantage or hazard. Few of these are insurmountable.

Recent work in the UK and beyond to reduce research bureaucracy and improve research culture may help create conditions where many such hazards can be overcome more easily. Modernised IT systems are also a prerequisite for the implementation of many interventions considered here. Often, the disadvantages of one intervention can also be offset by introducing an additional intervention. Not least for this reason, we often identify two or more interventions that are typically used together.

Our study highlights that there is a critical need to coordinate the use of the interventions with the context and aims of each specific funding opportunity in question. Based on our findings, creating bespoke funding processes tailored to the needs and aims of each funding opportunity is a clear ‘direction of travel’ for the future of R&I funding.

We find a mixed picture when it comes to strength of evidence. For some interventions, there is plenty of evidence, including experimental studies and quantified outcomes, for example:

  • applicant anonymisation
  • two-stage application processes
  • use of non-academic reviewers

However, others appear to be under-researched (for example, group review and moderation panels). We therefore recommend that funders continue to evaluate and monitor any interventions they use, and share findings with other funding organisations.

Our headline recommendation is that process design should always be a constituent part of scheme design. Every funding scheme has specific aims and characteristics, and so the design of the application, review and decision-making process should be considered for each individual funding opportunity.

We encourage funders to make extensive use of the interventions studied here and to vary their assessment processes widely. Some interventions (for example, peer review colleges or automation-assisted reviewer allocation) even have potential to be mainstreamed across funders’ entire portfolios.

We set out our full list of recommendations in the final section of this report.

Introduction

This report presents the findings of a study commissioned by UKRI on the use and effectiveness of interventions in peer review for grant-making processes. The study has been carried out by Technopolis from January to March 2023. The intention of this study is to act as a resource for all R&I funders across the globe.

The term ‘interventions’ is a catch-all for the many different organisational and procedural refinements to the baseline application assessment process, involving external peer review and panel review, used by R&I funders across the globe. We provide a generalised description of this baseline process below. This step-by-step breakdown is not intended as a representation specifically of UKRI processes, but as a generic heuristic of how research and innovation award funding decisions are typically made worldwide. Of the multitude of UKRI funding opportunities, those under the umbrella of ‘responsive mode’ funding tend to approximate most closely to this baseline process.

Baseline application assessment process in R&I funding

Step 1: applicant submits their application

Step 2: funder’s admin staff perform eligibility and compliance checks

Step 3: peer review of applications, including:

  • remote peer review by two to three external experts
  • panel review, resulting in a ranked list of applications from best to worst

Step 4: formal sign-off by department or organisation leadership

Each step has formalised standards, including:

  • reviewer selection
  • co-investigators
  • eligibility criteria

Peer review is trusted by researchers and research funders across the globe. Notwithstanding numerous advances in assessment techniques and technologies, it remains the primary means of R&I award selection. There is a large literature characterising peer review and exploring its strengths and weaknesses, which is being added to continuously for different domains and different potential solutions. Key issues with peer review include:

  • it can be burdensome and time-consuming for researchers, reviewers and funders
  • it tends to produce conservative decisions, avoiding risk and novelty
  • it struggles to suitably assess and reward interdisciplinary research
  • it can be biased in favour of established names and institutions, and there is some evidence of gender bias
  • fine-grained rankings of proposals can be influenced by reviewer choice
  • it is underused as a developmental tool (for example, investing sufficiently in feedback that has sufficient depth and quality to improve applicants’ future work)

Resulting in large part from these challenges, many funders have introduced various interventions to modify and deviate from the baseline. Some change drivers are ‘proactive’, signalling funders’ expanded remit or new strategic ambitions: for example, funding research to address societal needs, funding high-risk or high-reward research, or funding research at speed to respond to an emergency. Change drivers also have a ‘reactive’ side, responding to problems with traditional R&I funding assessment processes, including the peer review burden and the risk of bias outlined above.

Interventions covered in this review

Different interventions are intended to respond to different drivers. The key drivers for deviating from a baseline assessment process will vary depending on the aims and objectives of a given funding scheme. For example, interventions aimed at speeding up the assessment process will be important in an emergency response funding scheme, but less so (or not at all) for long-term investments.

Further, interventions may pertain to different parts of the funding process. We distinguish between interventions at the pre-announcement stage, those pertaining to the design of the application itself, the design of the assessment process, and the final decision-making stage. Finally, there are training or feedback interventions underpinning the entire process, so we posit this as an additional category of interventions.

In collaboration with UKRI, we compiled a list of 38 interventions to the baseline research award funding process. This list forms the basis of our review.

The list began with a preliminary list of 29 interventions, which was included by UKRI in the terms of reference for this study. Based on our own experience of evaluating R&I funding schemes across the globe, as well as on studies we recently conducted on peer review processes in general (including for UKRI, Wellcome, Formas (Sweden) and the Global Research Council (GRC)), we added to this list, and also split or combined various interventions from the preliminary list. Further consultation led to the final list of 38 interventions. We kept open the possibility to include additional interventions if we identified any interesting additional ones during our research. We summarise additional interventions in the ‘Additional interventions identified by our review’ section.

List of 38 interventions to baseline peer review process

Pre-announcement

Assessment criteria definition

Adding assessment criteria beyond the conventional ones. May involve a tiered system for assessment criteria (for example, essential versus desirable).

Demand management: individuals (1)

Limiting researchers to being a lead investigator only on one project or application at a time.

Demand management: individuals (2)

Having a ‘time out’ period of a year, so that after an unsuccessful application, the applicant is not allowed to apply the following year. This is based on previous behaviour and includes an element of quality control.

Demand management: institutions

Limiting the number of applications or resubmissions accepted from a single institution.

Working with underrepresented groups

Providing additional support to groups that are underrepresented in the funder’s portfolio to encourage them to apply and support them as they do, with a view to increasing diversity.

Application-design and parameters

Applicant behaviours

Designing application forms and processes with a view to encouraging positive behaviours among applicants (for example, removing hierarchies of applicants to encourage consortium building and collaboration).

Expression of interest or pre-proposal

A reduced application is submitted in an expression of interest phase (may simply be a short project description and CV) and triage occurs before a subset are invited to submit a full proposal.

See also two-stage application process.

Reducing application form length or cutting sections

Shortening application forms (page or word length) to reduce burden. Requiring only project description and not track record, or cutting other sections.

Process design

‘Sandpits’ or matching events

Potential applicants are invited to an event to discuss possibilities and form teams for potential proposals. May involve some application submission on the day.

Two-stage application process

Two ‘rounds’ of peer or panel review are used, sifting out some after the first stage. May involve different parts of the application being reviewed at different stages, or expression of interest or pre-proposal.

We note that the recent UK Research Bureaucracy Review uses the term differently. However, we opt here for a definition that most international funders would recognise in this form.

Applicant anonymisation

Reviewers, panel members or both do not see the identity of the applicant or applicants.

Automation-assisted reviewer allocation

Using algorithms, artificial intelligence (AI) or text recognition to aid allocation of reviewers to applications.
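To illustrate the simplest end of this spectrum, the sketch below matches an application to reviewers by bag-of-words cosine similarity between the application text and each reviewer’s self-declared expertise profile. The function names, data shapes and scoring scheme are illustrative assumptions, not a description of any funder’s actual system.

```python
import math
from collections import Counter

def tokenise(text: str) -> Counter:
    # Naive bag-of-words representation (illustrative only)
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, normalised by vector magnitudes
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def allocate_reviewers(application: str, reviewer_profiles: dict, k: int = 3) -> list:
    """Rank reviewers by textual similarity to the application and return the top k."""
    app_vec = tokenise(application)
    scores = {name: cosine_similarity(app_vec, tokenise(profile))
              for name, profile in reviewer_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Real systems typically use richer representations (for example, publation-record matching or trained text-embedding models) and must also handle conflicts of interest and reviewer workload limits, which this sketch omits.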

Dragon’s Den-style pitch

Applicants are invited to pitch their proposal in front of a panel, and panels have an opportunity to ask questions. This differs from an interview in that no other form of evidence (for example, written proposals or external expert review) is used in the assessment.

Note for non-UK readers: the term ‘Dragon’s Den’ originated from a UK TV show involving pitching of business ideas to investors.

External review only (no panel)

Proposals are only assessed by external reviewers and review scores are simply combined to give the final score.

Group review

The same reviewer comments on multiple proposals.

Changing the number of reviewers

Two to three external reviews of applications is typical for responsive-mode grant funding, but this number may be lowered to one or significantly increased.

Interviews

The lead applicant (or several application team members) may give an optional presentation and are then asked questions on their application by panel members, reviewers or funder representatives.

Moderation of reviews

Reviews are processed internally by funding organisation staff and are only passed to the external panel if they are of sufficient quality.

Moderation panel

Moderation panels do not use their own expertise but can only use the external reviews to inform their scores.

By contrast, assessment panels use external reviews alongside their own expertise to assess the proposal. Because assessment panel members can bring in their own expertise, that approach is mostly part of the baseline process and is therefore not considered as an intervention in this study.

Note that to ensure clarity for the widest possible readership, we are using terminology that might not align with UKRI terminology. In UKRI some moderation panels can bring in generic or system expertise.

Panel only (no postal or external review)

Proposals are only assessed by a panel of experts.

Peer allocation

The applicants are also the assessors and review the proposals they are competing against to decide who gets funding.

Programme manager’s discretion

Applications go directly to the programme or scheme manager, who can recommend funding or even decide to fund unilaterally. Usually involves complete bypass of peer and panel review.

Standing panels versus portfolio panels

Standing panels are the same year on year (with some replacement due to retirement from the panel). Portfolio panels are assembled based on the proposals received and will therefore be composed differently in each round of funding.

Use of international assessors

Having quotas for assessors based in countries other than the funder’s ‘home’ country. May extend to mandating all-international panels, reviewers or both.

Use of metrics

Use of metrics and bibliometrics as part of the evidence base to inform decision making.

Use of non-academic assessors (including industry, policy and practice, patients and ‘user’ representatives)

Having quotas for non-academic assessors. May extend to all-user panels, reviewers or both. May take the shape of consultation rather than directly making formal funding recommendations.

Virtual panels

Convening panels online rather than in person.

Decision-making

Wildcard

Sometimes also known as ‘golden ticket’ or ‘joker’. Each panel member (or other decision-maker) is able to select one proposal (for example, per opportunity, per year, or similar) to guarantee funding (provided there is no conflict of interest), regardless of panel rankings or other decision-making processes.

Partial randomisation

Successful proposals are chosen at random. In most methodologies, randomisation is only partial. For example, proposals may be scored and sorted into bands, and only those on the border of being funded will be randomised.
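The banding logic described above can be sketched as follows. The scoring scale, fixed numeric band width and function name are illustrative assumptions; funders that use lotteries typically define the borderline band through panel judgement rather than a fixed threshold.

```python
import random

def partial_randomisation(scores: dict, fund_n: int, band_width: float = 0.5, seed=None) -> list:
    """Fund clear top scorers outright; draw lots among borderline proposals.

    `band_width` defines how close (in score) a proposal must be to the
    funding cut-off to be treated as borderline (an illustrative parameter).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff_score = scores[ranked[fund_n - 1]]  # score of the last fundable proposal
    # Proposals clearly above the borderline band are funded outright
    clear = [p for p in ranked if scores[p] > cutoff_score + band_width]
    # Everything within the band around the cut-off goes into the lottery
    borderline = [p for p in ranked
                  if cutoff_score - band_width <= scores[p] <= cutoff_score + band_width]
    rng = random.Random(seed)
    return clear + rng.sample(borderline, fund_n - len(clear))
```

Note that proposals well below the band are never funded and those well above it always are; randomness applies only where scores are too close to distinguish reliably.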

Scoring mechanisms

Includes calibration of scores, consensus versus voting, weighting.
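As one concrete example of score calibration, the sketch below standardises each reviewer’s scores against that reviewer’s own mean and spread (a z-score), so that habitually harsh and lenient reviewers can be compared on a common scale. This is one calibration method among several; the data shapes are illustrative assumptions.

```python
from statistics import mean, stdev

def calibrate(reviewer_scores: dict) -> dict:
    """Z-score each reviewer's scores against their own mean and spread.

    Assumes each reviewer has scored at least two proposals.
    """
    calibrated = {}
    for reviewer, scores in reviewer_scores.items():
        mu = mean(scores.values())
        sigma = stdev(scores.values()) or 1.0  # guard against zero spread
        calibrated[reviewer] = {p: (s - mu) / sigma for p, s in scores.items()}
    return calibrated
```

After calibration, a harsh reviewer’s relatively best score and a lenient reviewer’s relatively best score map to the same standardised value, which removes one common source of noise when combining scores across reviewers.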

Sequential application of criteria (rather than simultaneous application of criteria)

A proposal is scored for one set of criteria, ranked and a cut-off point determined. Then those above the cut-off point are assessed again for another set of criteria to determine the final funded list.
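The two-pass logic can be sketched as follows; the scoring functions, cut-off and function name are illustrative assumptions.

```python
def sequential_assessment(proposals, stage1_score, stage2_score, cutoff_rank, fund_n):
    """Apply two criteria sets one after the other rather than simultaneously."""
    # Stage 1: rank against the first criteria set and keep those above the cut-off
    shortlist = sorted(proposals, key=stage1_score, reverse=True)[:cutoff_rank]
    # Stage 2: re-assess only the shortlist against the second criteria set
    return sorted(shortlist, key=stage2_score, reverse=True)[:fund_n]
```

The defining difference from simultaneous scoring is that a proposal scoring highly on the second criteria set is still excluded if it falls below the stage-one cut-off.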

Use of quotas

After ranking, proposals are reviewed to ensure sufficient numbers in certain categories, including quotas related to:

  • protected characteristics
  • place
  • first-time applicants

Training and feedback

Bringing in reviewers from earlier careers and providing mentoring

Panels and reviewers tend to be very experienced researchers or innovators. Those early in their careers could be invited to review or be part of panels with additional training, bringing different perspectives and experiences. Previous funding opportunity award winners may also be brought in as reviewers or panellists.

Embedding equality, diversity and inclusion in assessment

Training or support provided to make assessors aware of their unconscious biases and to encourage them to call each other out during the assessment process.

Expanding or reducing the amount or detail of feedback to unsuccessful applicants

Different levels of feedback may be provided on unsuccessful applications.

Funder representation on review panels

The funder is represented on the panel to guide discussion or provide briefing on programme aims. Their role goes beyond a purely administrative function; they may even be in a chair role or similar.

Improving quality of reviews

Through training, retaining good reviewers or recognition. Peer review colleges fit here too.

Open review or rebuttal

Reviews are published or made available to the applicant before funding decisions are taken, so they can be viewed and responded to.

Method note

For each of the 38 interventions, we set out to compile an evidence base to establish the following points:

  • definitions: what exactly does the intervention involve? Are there relevant differences in how different funders practise the intervention?
  • why to do it: what is the envisaged benefit of the intervention? What problems or issues is it supposed to solve? What, therefore, might be measures of its success?
  • why not to do it: does the intervention have any weaknesses, hazards or drawbacks? Are these especially problematic under certain circumstances (for particular scheme types)?
  • evidence verdict and strength of evidence: is there evidence to show that this intervention has (or has not) worked? What is the strength of the evidence (for example, controlled experiments, light-touch evaluation, anecdotal)?

Our study had three data collection strands, which ran in parallel.

First, we conducted a review of literature on research award funding processes. This included academic literature as well as evidence from evaluations of various funding schemes and wider strategic studies. We conducted keyword searches for each intervention and also added resources known to us prior to the study.

We further conducted a consultation survey of UKRI staff. This survey was primarily intended to obtain views on any of our 38 interventions that may have been trialled in different parts of the organisation, including comments on interventions that worked well and interventions that did not. This information adds additional stakeholder perspectives to the findings obtained from the literature. While the survey cannot be fully representative, we added some survey items that help quantify which interventions appear to be well known or less well known in different parts of UKRI. The survey also identified whether there is particular appetite for certain interventions to be used more.

Finally, we ran a programme of interviews to obtain further viewpoints on the 38 interventions. We included in the programme a small number of follow-up conversations with UKRI survey respondents, as well as several representatives from UK funders other than UKRI, international funders, stakeholders from the UK higher education institution (HEI) landscape, and a selection of academic experts on peer review, its modifications and alternatives.

Method details are presented in the appendices to this report.

Structure of this report

In the next section, we present some aggregate findings and general observations from our research. In the subsequent five main sections we present the evidence on each of the 38 interventions, split by our five intervention domains:

  • interventions at the pre-announcement stage
  • interventions pertaining to design of the application itself
  • the design of the assessment process
  • the final decision-making stage
  • training or feedback interventions underpinning the entire process

The findings in these five main sections are aggregate summaries from our literature review (see Appendix A), our survey (see Appendix B) and our interviews (see Appendix C).

For each intervention, we provide a write-up explaining its aims, some data highlights (including instances of use) and any known effects, as well as hazards or dangers associated with each intervention. For each intervention, we also provide a simple rating of the evidence strength. This rating relates only to evidence strength, not to intervention effectiveness: it does not reflect whether the intervention works, but the strength of evidence demonstrating its efficacy (or lack thereof, as the case may be):

  • one star (*): very limited evidence, almost or entirely tentative or speculative
  • two stars (**): some evidence, for example several anecdotal pieces and perhaps some minor empirical observations
  • three stars (***): multiple sources of credible evidence, though not necessarily quantifiable conclusions, and not all parts of the intervention have been investigated thoroughly (for example in cases of multiple aims)
  • four stars (****): multiple sources of credible evidence, including experimental or other empirical measurement or evaluation

Finally, we provide a brief overview of a small number of other minor interventions not included in our initial list of 38, but which were discovered by the research team over the course of the study. The last section of this report provides a summary of findings and our list of recommendations resulting from our research. We note that these recommendations are not specific to UKRI but may be considered by any R&I funder looking to optimise and innovate in their award-making processes.

General observations

The next main section presents findings on each of the 38 interventions and forms the bulk of this report. However, there are some general observations worth noting at the outset.

First, we find that the rationales for the interventions (as expressed in the literature and by consultees) correspond well to the problems of peer review and the ‘baseline’ funding process noted at the outset. Almost all interventions considered here draw their rationale from the following seven (partially related) aims:

  • to save time, including speeding up time-to-grant, either from a simple efficiency point of view, or in order to be able to respond to emergencies
  • to optimise the relevance of applications and funded awards to the aims of the funding scheme (for example, thematic or sector relevance, maximum scope for application)
  • to increase the ability to identify and fund high-risk or high-reward projects (also known as ‘transformative’, ‘radical’, ‘frontier’ or ‘breakthrough’ research); in a sense this is a subset of the previous aim of optimising relevance (if a scheme specifically aims to fund such research) but it relates to the well-documented conservatism of peer review, which is an issue in its own right
  • to reduce burden on applicants, reviewers, panellists, and administrators; this is generally about efficiencies and minimising the effort and cost needed to carry out the review of applications
  • to manage application volume; this may to an extent relate to reducing burden more generally, but also relates to discouraging applications that are out of scope or of unsuitably low quality
  • to reduce bias and ensure greater inclusion of disadvantaged, underrepresented groups or both, including along lines of gender, career stage, institution, or any other category
  • to improve the overall quality of reviews, which may mean, for instance, to ensure optimally tailored expertise of reviewers, as well as increased levels of transparency and feedback

Almost without exception, every literature source, interviewee and survey respondent cites at least one of the above seven aims when discussing any of our 38 interventions. Some interventions relate only to one of these seven aims, though most are associated with several (often two or three). In the concluding section of this report, we provide a comprehensive overview of how our 38 interventions relate to each of the seven aims.

While this list of intervention aims is foremost intended to help funders decide when and why to introduce each intervention, we note that it may also be a useful tool to secure buy-in from the research community. The issue of buy-in is not covered in detail in any of the sources we consulted, but our research indicates that wider buy-in is a concern that funders occasionally have when contemplating introducing interventions. A clear rationale for introduction which draws on this list of possible aims may contribute towards mitigating such concerns.

At the same time, no intervention is a catch-all solution. None pertain to all seven aims, most are useful for certain contexts and less useful (or even problematic) in others, and almost all may entail some form of disadvantage or potential hazard.

This means that there is a critical need to coordinate use of the interventions with the context and aims of the specific funding opportunity in question. Creating bespoke funding processes tailored to the needs and aims of each funding opportunity is a ‘direction of travel’ for the future of research funding.

The seven main aims and the frequency of them among the 38 interventions are:

  • save time – 9
  • optimise the relevance of applications – 11
  • increase the ability to identify and fund high-risk or high-reward projects – 7
  • reduce burden – 11
  • manage application volume – 3
  • reduce bias – 13
  • improve the overall quality of reviews – 17

Regarding the disadvantages and hazards of each intervention, few are insurmountable. We do not provide a full assessment of how easily each hazard or disadvantage may be overcome, in part because this is often context dependent. Some funders’ IT systems may for instance be readily able to address many noted challenges. Some hazards may be more severe in smaller research systems (be they delineated by country or research field) where conflicts of interest are more likely to occur.

We note hazards and disadvantages where we identify them, but it will most often be dependent on each funder’s context whether they constitute a ‘showstopper’ or whether they can be dealt with. We also note mitigations where these are evident from our research.

We also note that several interventions may complement each other and may often appear together. For instance, wildcard approaches tend to be used in combination with anonymised reviewing in order to minimise scope for cronyism. In such cases, a hazard associated with one intervention is mitigated by adding another intervention.

On the other hand, some interventions might also counteract each other. An intervention may increase the quality of applications but might entail additional burden, lengthen time-to-grant or both. Others may do the opposite. Pairing or combining different interventions does not appear to be a well-researched topic. However, we have identified some common pairings and rationales for pairing certain interventions together (noted where relevant in the following main section of this report).

Additional meta findings

Positives in peer review must be recognised

Several survey respondents and interviewees felt the need to emphasise the good things about the peer review system (including the ‘baseline’ approach). In light of the overall criticisms, consultees often judge it essential to praise the work that is globally invested in the peer review effort and the benefits it brings.

Funding staff, having regularly observed panels in action, feel it is sometimes unfair to showcase only examples of failure and ignore the positives. This includes the effort invested, the care that reviewers put into the activity and the benefits a good review brings to the research community.

Political sensitivities and acceptance of the interventions in the research community

Some interventions, primarily partial randomisation, but also the use of quotas, demand management, interviews, and sandpits, have sometimes raised concerns in the research community or at research funders’ boards or oversight institutions. All consulted funders that have introduced partial randomisation report investing extra effort to make a case to the funder’s board or to the ministries that oversee their operations, regardless of geography. Private funders might be less concerned about external pressures but still have to make a solid internal case.

Some consulted funders and experts pointed out that the acceptance from the oversight bodies and wider society is a more significant concern than the acceptance in the research community itself. This is mainly because researchers are more familiar with the peer review system, its strengths and weaknesses and understand the rationale of the more experimental interventions.

In some cases, the significant scrutiny and risks result in a reluctance to try new things. However, a degree of scepticism is certainly warranted. Demand management, interviews, sandpits and some other interventions do raise concerns about, for instance, equity and the potential favouring of applicants who can access specific meetings and events and have good presentation skills.

Shifting responsibility from funders to research performing organisations

There is some tension between the responsibilities of funders and research performing organisations in addressing the equity and burden in research funding. Some of the interventions may mean less burden for the funders but more for the research administrators at the research performing organisations.

Where demand management is transferred to the institutional level, research performing organisations may effectively carry out the assessment for the best application to put forward. There are also examples of funders removing some requirements (for example, specific sections in applications or monitoring requirements). However, those are still implemented or asked for at the research performing organisations to maintain internal oversight. As a result, nothing changes regarding the burden for the ‘regular’ researcher and for the research system as a whole.

Manuscript peer review

In the academic literature, some interventions are discussed primarily or only in the context of manuscript peer review (for example, for journal publication). Literature and our expert consultation show that journals frequently experiment with new interventions and assess the results of these experiments. Examples include efforts to improve the review process through open peer review (making reviews public) and improving the quality and reliability of peer review through reviewer training.

Literature on manuscript peer review shows improvements in review reliability in terms of identification of errors or recommending manuscripts for rejection after the introduction of reviewer training.

Although not without controversy (for example, concerns that open reviews may contain less critical comments), manuscript peer review may provide examples worth considering for research funders, even though it falls beyond the scope of this review.

Another example is the speed with which scientific publishers have reacted by introducing rules and guidance on the use of large language models (AI tools such as ChatGPT) in manuscript preparation and review. Several consulted research funders were concerned about the impact of large language models on research funding processes, so examples of how the matter is addressed in manuscript peer review might be worth considering.

It must also be noted that elements of grant peer review such as interviews and panels complicate direct comparison with journal peer review. The two also happen at different stages of the research process: journal peer review assesses completed work, while grant peer review assesses proposed work.

Main findings: interventions prior to a funding opportunity

Assessment criteria definition

Adding assessment criteria beyond conventional ones. This may involve a tiered system of criteria, for example essential versus desirable.

Main intended aims

The main aim is to increase the relevance of funded projects to the funding opportunity’s aims.

Main hazards

The hazards include that:

  • reviewers may not follow guidance
  • too many criteria risk overcomplicating discussions

Evidence strength

Three star

Findings

This intervention may include:

  • clear guidance with definitions of criteria
  • non-biased language (for example, gender) and weighting of criteria
  • ensuring criteria are suitably discussed and applied during panel meetings

The aim is to make sure that proposals are assessed according to the intended criteria, and therefore according to the aims of the funding opportunity. Emphasis is on increasing transparency, consistency and simplification, as well as on ensuring that the selection reflects the objectives of the specific funding scheme (especially when including new criteria that might otherwise be undervalued).

An impact evaluation of one scheme shows the effectiveness of the intervention in supporting projects aimed at achieving non-academic impact, though the effect cannot be attributed solely to the criteria. (In a small number of cases, consultees were hesitant about publicly sharing certain examples; this is one such case.) Funders have observed that this approach meant they funded projects that went on to have impact, and that these would not have been funded had the assessment been based purely on research quality.

Several authors appear to agree that more explicit criteria are desirable to avoid bias and inconsistency. However, the evidence also highlights a perception that criteria that go beyond research excellence can still be challenging for reviewers, panellists, or both, to apply. There is also a limit to the number and complexity of criteria that panels can handle.

Further, reviewer behaviour does not necessarily conform to guidance. Evidence suggests that external reviewers pay more attention to written guidance than panel members do.

References

Peer Review of Grant Applications: Criteria Used and Qualitative Study of Reviewer Practices. Abdoul H, Perrey C, Amiel P, Tubach F, Gottot S, Durand-Zaleski I, and Alberti C.

Criteria for assessing grant applications: a systematic review. Hug SE, and Aeschbach M.

Do peers share the same criteria for assessing grant applications? Hug SE, and Ochsner M.

The decision-making constraints and processes of grant peer review, and their effects on the review outcome. Langfeldt L.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S, and Guthrie S.

Assessment of potential bias in research grant peer review in Canada. Tamblyn R, Girard N, Qian CJ and Hanley J.

Interviewees and survey responses

Two survey responses

Two interviews

Demand management: individuals (1)

Limiting researchers to being lead investigator on only one project or application at a time.

Main intended aims

The main aim is to reduce application numbers and concentration of awards.

Main hazards

The main hazards are that:

  • it shifts burden to other funders
  • savings are minimal

Evidence strength

One star

Findings

Many funders limit applicants to one application per funding opportunity. However, this may be expanded to one application across the funder’s entire portfolio. This intervention is intended to reduce the number of applications (by limiting or excluding the participation of current awardees). There may also be a motivation to limit the concentration of awards among a small number of continuously successful researchers.

This intervention is rare, not least as it requires a comprehensive research information system (preferably covering multiple funders so that applications cannot be resubmitted to other funders instead). There is ongoing use at the Swedish Research Council, though no assessments or feedback could be identified.

Our research also highlights sceptical views around this intervention. The Royal Society has stated that it does not support disincentives to apply for funding in the first place (though this statement is from 2007). In addition, a RAND report concluded that savings from restrictions targeting individuals were marginal if proposals became more complex as a result, and recommended institutional quotas for more substantive savings. As noted above, resubmission to other funders is a risk, so burden and application influx are shifted rather than lessened.

Most UK-based evidence we find dates from around 2006 to 2007, following the publication of a peer review report by Research Councils UK (RCUK, UKRI’s predecessor). It does not appear to be a heavily studied intervention or one for which there is much ‘appetite’.

References

The effects of funding policies on academic research. Grove L.

Evaluating Grant Peer Review in the Health Sciences: A review of the literature. Ismail S, Farrands A, and Wooding S.

Report of the Research Councils UK Efficiency and
Effectiveness of Peer Review Project. Research Councils UK.

Response to the RCUK consultation on the Efficiency and Effectiveness of Peer Review. Royal Society.

Several grants simultaneously. Swedish Research Council.

What requirements apply if I already have a grant from the Swedish Research Council. Swedish Research Council.

Interviewees and survey responses

No survey responses or interviews

Demand management: individuals (2)

Having a ‘time out’ period of a year, so that after an unsuccessful application the applicant is not allowed to apply the following year. This is based on previous behaviour and includes an element of quality control.

Main intended aims

The main aims are to:

  • limit application volume
  • increase success rates
  • reduce burden

Main hazards

The main hazards are that:

  • it may simply shift resubmission to other funders
  • it may not be well received by the applicant community

Evidence strength

Two star

Findings

This intervention aims to control application-based demand (for example, application volume and overall success rates) and reduce the workload for funders and reviewers.

The Engineering and Physical Sciences Research Council (EPSRC) already operates a variant of this intervention. Under the EPSRC variant, any investigator who has been repeatedly unsuccessful over the preceding two years is written to and limited to just one application in the following 12 months.

This approach has received positive feedback from reviewers and senior university personnel. Paired with a ban on identical resubmissions, it has been found to reduce application volumes.

Some comments note that researchers who are already subject to bias may be placed at an increasing disadvantage. They also note that this approach may damage individuals’ confidence, experience, career and wellbeing, though this has not been studied (logically, though, it appears plausible that the approach would penalise at least some potential applicants).

In the absence of comprehensive international research information systems, it is also impossible to control for researchers resubmitting applications to other funders. While this approach therefore limits burden for the funder in question, it is unlikely to lead to burden reduction in the wider research system.

Our research indicates that introducing this intervention has occasionally been controversial. Generally, it appears to divide researchers (especially early career researchers) on the one hand from funders and senior personnel on the other, with the latter typically viewing the approach fairly positively. However, this assessment is based only on various pieces of anecdotal evidence.

References

Science funding: Duel to the death. Bhattacharya A.

Sham Peer Review: the Destruction of Medical Careers. Huntoon L R.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S, and Blatch-Jones A.

Tough love. Nature.

Interviewees and survey responses

Two survey responses

Four interviews

Demand management: institutions

Limiting the number of applications or resubmissions accepted from a single institution.

Main intended aims

The main aims are to:

  • limit application volume
  • increase success rates
  • reduce burden on the funder and reviewer community

Main hazards

The main hazards are that:

  • it largely shifts burden to institutions
  • it may introduce additional bias, depending on institutional processes

Evidence strength

Four star

Findings

This intervention aims to reduce workload and administrative burden for the funder and reviewer community. Key indicators would be the number of applications received (lower than without this intervention) and a higher application success rate.

This intervention is known to accomplish what it intends. Variations are in continued use at multiple funding organisations in the US and Europe. Examples include:

  • the Natural Environment Research Council (NERC) limits the number of applications per institution where the higher education institution (HEI) in question has failed to meet a 20% success rate over the six most recent grant rounds
  • the European Society for Paediatric Research allows an unlimited number of applications, but only awards one per institution
  • National Institutes of Health (NIH) allows two applications per institution in the Director’s Early Independence Awards
  • the Economic and Social Research Council (ESRC) allows a limited number of applications per institution for its research centres competition (alongside the use of outline proposals)
  • the US National Science Foundation (NSF) allows three expressions of interest (EoIs) per institution, of which a maximum of one can result in an invite to submit a full proposal

The National Natural Science Foundation of China (NSFC) has a different version of demand management. The 2011 evaluation of the NSFC noted that applicants submit their proposals via their host institution and they may not submit them directly to NSFC. An applicant may not apply more than once per year to any single NSFC programme or hold more than three NSFC grants at the same time (this example is also relevant to the previous section).

This approach is known to have reduced the number of applications in schemes that ran previous funding rounds without the intervention.

A major problem with this intervention consistently mentioned throughout the evidence is that it largely shifts selection burden from the funder to the institution. The institution may opt for a more limited reviewing procedure, thus still reducing overall burden to some extent. However, there is also some anecdotal evidence that institutions may be less experienced in some aspects of selection processes, leading to suboptimal outcomes.

This intervention is not passionately debated one way or the other. Most sources agree on the strengths and hazards of this approach; however, there are naturally conflicting interests between funders and research performing organisations. As a practice, it is in use in multiple contexts, though it appears especially commonplace in the US, particularly in funding opportunities aimed at early career researchers. We also note this approach is often paired with the use of ‘expressions of interest’.

References

International Evaluation of the Funding and Management of the National Natural Science Foundation of China. Arnold E, Lu Y, and Xue L.

ESPR Research Grant Programme 2017 to 2022. ESPR.

ESRC – Large Grants Competition 2016/17 (Update). University of Lincoln: Research Blog. Mycroft C.

Demand management. NERC.

Towards inclusive funding practices for early career researchers. de Winde CM, Sarabipour S, Carignano H, Davla S, Eccles D, Hainer SJ, Haidar M, Ilangovan V, Jadavji NM, Kritsiligkou P, Lee T-Y, and Ólafsdóttir H F.

Interviewees and survey responses

One survey response

No interviews

Working with underrepresented groups

Providing additional support to groups that are underrepresented in the funder’s portfolio, to encourage them to apply and support them as they do, with a view to increasing diversity.

Main intended aims

The main aim is to increase diversity of applicants and award winners.

Main hazards

The main hazards are that:

  • it may take some time to show effect
  • it may entail administrative burden

Evidence strength

Four star

Findings

This intervention intends to increase the number of applicants from underrepresented groups (and their success rate when applying for an award), for example ethnic minority groups or younger and early career researchers.

The Arts and Humanities Research Council’s (AHRC) 2020 to 2022 Equality, Diversity and Inclusion Engagement Fellowship (EDIEF) pilot specifically targeted arts and humanities researchers whose work has a significant equality, diversity and inclusion (EDI) dimension. The funding opportunity sought to enable researchers to engage a variety of relevant stakeholders with their research, to embed their work into policy and practice, and to work with relevant communities to realise the full potential benefits of their research.

The intervention emerged as a response to previous studies identifying barriers to collaborative research partnerships with minority ethnic communities (common cause research), and a commitment to improving EDI. An evaluation of the pilot showed that 28% of applicants were Asian, Black or of mixed ethnicity, while only 9% of applicants to the standard research grant scheme were from an ethnic minority.

The UK Equality Challenge Unit (ECU) launched the Athena SWAN charter in 2005 to recognise universities’ work to improve gender equality and the representation of women in science, technology, engineering, medicine and mathematics. Because participation is voluntary, universities are not set goals but are instead encouraged to assess their current gender gaps and adopt measures to reduce disparities.

The Athena SWAN Charter offers different levels of accreditation (bronze, silver and gold) depending on the type of interventions and strategies adopted to alleviate gender gaps. Universities need to gain Athena SWAN Charter membership before they can apply for accreditation. For bronze accreditation, universities undertake an assessment of gender disparities and propose a five-year plan to address them. Silver recognition requires the implementation of specific actions, while gold is awarded to those achieving or improving gender parity levels.

In 2011, the National Institute for Health Research (NIHR) linked its research funding for biomedical research centres to actions towards gender equality through the Athena SWAN charter. In 2016, academic institutions had to hold at least silver accreditation to be shortlisted for funding. This intervention led to an increase in women theme leads from 8% in 2006 to 24% in 2016. It may also have contributed to the increase in the number of universities in the field implementing action plans from one in 2011 to 69 in 2016. According to the literature, this intervention has been replicated by funders and science organisations in Ireland, Australia, the US and Canada.

A final example found in the literature is the National Research Mentoring Network delivered by the NIH (US) as part of the ‘Diversity Program Consortium’. It is reported that intensive and sustained training of early career researchers of underrepresented minority groups can help participants to achieve the benchmarks of proposal submission and funding. This mentoring can also have an impact on other areas, such as teaching.

Our interviewees also note that using positive language to encourage women’s participation has led to an increase in the number of applications from women.

Action to improve not just the application rate but also the success rate of underrepresented groups is a rather broader issue with many possible techniques. Most notably, we return to this issue when we address anonymised reviewing, as well as various training interventions covered in the latter parts of this report.

However, on a final point here it is worth mentioning efforts to diversify reviewers. In its 2022 Race and Ethnicity Inequity report, EPSRC noted its action to increase the representation of ethnic minority researchers on its peer review college to 20%. This is being done by actively encouraging self-nominations to the peer review college from all researchers, while particularly seeking nominations from minority ethnic researchers. In the first six months of the campaign, EPSRC observed a positive response, with a 2.5-fold increase in self-nominations compared with the previous year.

There is general agreement on the benefits and effectiveness of the intervention. However, while not noted explicitly in the literature, these actions are likely to take time, which can limit their applicability. There is also an associated administrative burden, which, according to some sources, may fall disproportionately on women.

We note as a general comment on the evidence that historically, much of the literature around this intervention has focused mostly on gender. However, among the more recent sources we have considered here, race or ethnicity and age also feature quite strongly.

References

The Equality, Diversity and Inclusion Engagement Fellowship Pilot AHRC funding scheme report 2020 to 2022. Blackburn M, Coutinho K and Suviste H.

Ethnicity and race inequity in our portfolio: findings of our community engagement and actions for change. EPSRC.

Gender equality and positive action: evidence from UK universities. Gamage DDK, and Sevilla A.

Positive action towards gender equality? Evidence from the Athena SWAN Charter in UK Medical Schools. Gregory-Smith I.

Effect of Athena SWAN funding incentives on women’s research leadership. Ovseiko PV, Taylor M, Gilligan RE, Birks J, Elhussein L, Rogers M, Tesanovic S, Hernandez J, Wells G, Greenhalgh T and Buchan AM.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

The leaky pipeline in research grant peer review and funding decisions: challenges and future directions. Sato S, Gygax PM and Randall J.

Grant application outcomes for biomedical researchers who participated in the National Research Mentoring Network’s Grant Writing Coaching Programs. Weber-Main AM, McGee R, Eide Boman K, Hemming J, Hall M and Unold T.

Interviewees and survey responses

No survey responses

Two interviews

Main findings: interventions in assessment process design

‘Sandpits’ or matching events

Potential applicants are invited to an event to discuss possibilities and form teams for potential proposals. May involve some application submission on the day.

Main intended aims

The main aim is to foster inter- or multidisciplinary research, new collaborations and transformative research.

Main hazards

The hazards include:

  • problems of access and EDI issues (these can be partially resolved through remote events)

Evidence strength

Four star

Findings

This intervention intends to foster interdisciplinary research and more innovative proposals and solutions to research challenges, particularly when seeking to promote transformative research.

EPSRC has used the Ideas Factory Sandpit for over 10 years, with positive outcomes in terms of the establishment of research communities. There is an observable culture change among participants, who embrace creativity and originality, and an increase in the capacity of multidisciplinary researchers and their interaction in the UK.

EPSRC has also run sandpits at a distance (remotely), with positive results. For respondents, this intervention creates opportunities for building new multidisciplinary partnerships and fostering blue-skies ideas.

The EPSRC sandpits also include elements of group review, so this example also pertains to the group review section (5.13) of this report. However, we focus on the collaboration-building aspect of the sandpits.

There are, however, some negative effects from an EDI perspective reported in the literature and by consultees, due to the sandpits’ setup. Intensive face-to-face interaction, mostly away from home and lasting one to five days, reduces opportunities for participation for those with caring responsibilities, and potentially for those with disabilities or sensory needs.

Remote sandpits offer more flexibility but do not overcome all the limitations identified. EPSRC implemented a number of further mitigation measures including:

  • inviting and paying for carers to sandpits to enable the applicant to attend
  • adapting the facilitation style of the sandpit to make it more accessible
  • embedding more breaks into the sandpit and changing the model of the sandpit to be accessible virtually over a different timescale to ensure a reduction in screen time

There is clear evidence and strong agreement on the positive impact of sandpits or matching events on fostering multidisciplinary research and innovative solutions to research challenges. Sources also converge on limitations and negative effects for EDI, though the mitigation efforts noted above may provide important ways forward.

References

The experimental research funder’s handbook, Research on Research Institute. Bendiscioli S.

Alternatives to peer review in research project funding. Guthrie S, Guerin B, Wu H, Ismail S and Wooding S.

Sandpit methodology: results of a rapid literature search to inform a sandpit exercise for PETRA. Lodge H.

Sandpits can develop cross-disciplinary projects, but funders need to be as open-minded as researchers. Maxwell K, Benneworth P and Siefkes M.

Decision-making approaches used by UK and international health funding organisations for allocating research funds: a survey of current practice. Meadmore K, Fackrell K, Recio-Saucedo A, Bull A, Fraser SDS and Blatch-Jones A.

What works for peer review and decision-making in research funding: a realist synthesis. Research integrity and peer review. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser, S and Blatch-Jones A.

Creativity greenhouse: at-a-distance collaboration and competition over research funding. Schnädelbach H, Sun X, Kefalidou G, Coughlan T, Meese R, Norris J and McAuley D.

Exploring the potential role of community engagement in evaluating clinical and translational science grant proposals. Treem JW, Schneider M, Zender RL and Sorkin DH.

Interviewees and survey responses

Three survey responses

Two interviews

Two-stage application process

Two ‘rounds’ of peer or panel review are used, sifting out some after the first stage. May involve different parts of the application being reviewed at different stages, or a pre-proposal or expression of interest (see above).

Main intended aims

The main aims are to reduce burden for reviewers, applicants and programme officers, and to increase the relevance of stage-two proposals.

Main hazards

The main hazard is a slight danger of reduced levels of feedback.

Evidence strength

Four star

Findings

This intervention is strongly linked to the ‘pre-proposal or expression of interest’ intervention, and often the two may be interchangeable. Stage one may involve a pre-proposal, though it is also possible that the same proposal document is reviewed at both stage one and stage two; in such cases, this intervention is distinct from the pre-proposal one. Review at the first stage is typically conducted by a review panel (often put together specifically for the funding opportunity to reflect the thematic nature of the applications), though in some cases remote reviewers are also used.

The purpose of this intervention is to reduce overall burden of the evaluation process (on applicants, administrators and reviewers). It is also used to sift out applications that do not meet particular requirements (for example, out of scope).

There are verified positive outcomes from this intervention for both funders and applicants. Wellcome has adopted it and it has become regular practice in its evaluation process, reducing written review burden by 50%. NIHR adopted it with successful results, including:

  • increased number of applications
  • reduced number of applications per reviewer
  • lower cost per evaluation round (40% reduction)
  • shorter notification periods to applicants

There is wide agreement among our sources and consultees on the positive effects of this intervention in sifting out applications that do not meet programme requirements. The only noted concerns are about limited feedback on first-stage applications, meaning less feedback overall in the research system and consequently less scope for learning.

Some scepticism was voiced in a study of the Horizon 2020 proposal assessment process, where the study team found that the length of stage-two applications did not differ significantly from single-stage applications. They therefore claimed that at least 65% of stage-one applications must be rejected for the overall process to reduce burden rather than increase it.
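The break-even arithmetic behind such claims can be illustrated with a simple model of our own (the cost fractions below are assumptions chosen for illustration, not figures from the study): if writing and reviewing a stage-one proposal costs some fraction of a full proposal, and every surviving applicant then submits a full-length stage-two proposal, overall burden only falls once the stage-one rejection rate exceeds that fraction.

```python
# Illustrative break-even model for two-stage application burden.
# This is our own sketch; the 0.65 cost fraction is an assumption
# chosen to mirror the ~65% rejection threshold discussed above.

def two_stage_burden(n_apps, stage1_cost, rejection_rate):
    """Total effort in 'full-proposal equivalents' for a two-stage process."""
    stage1 = n_apps * stage1_cost                 # everyone writes stage one
    stage2 = n_apps * (1 - rejection_rate) * 1.0  # survivors write a full proposal
    return stage1 + stage2

def one_stage_burden(n_apps):
    """Baseline: every applicant writes one full proposal."""
    return n_apps * 1.0

# With stage-one proposals costing 0.65 of a full one, a 65% rejection
# rate is the break-even point; higher rejection rates yield a saving,
# lower ones make the two-stage process MORE burdensome overall.
breakeven = two_stage_burden(100, 0.65, 0.65)
saving = one_stage_burden(100) - two_stage_burden(100, 0.65, 0.80)
```

The model deliberately ignores second-order effects (panel costs, feedback effort), but it shows why a two-stage design can backfire when stage-one documents are nearly as long as full proposals.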

References

Streamlined research funding using short proposals and accelerated peer review: an observational study. Barnett AG, Herbert DL, Campbell M, Daly N, Roberts JA., Mudge A and Graves N.

NSF tries two-step review, drawing praise – and darts. Mervis J.

Assessing health research grant applications: a retrospective comparative review of a one-stage versus a two-stage application assessment process. Morgan B, Yu LM, Solomon T and Ziebland S.

Exploring the potential role of community engagement in evaluating clinical and translational science grant proposals. Journal of Clinical and Translational Science. Treem JW, Schneider M, Zender RL and Sorkin DH.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

Interviewees and survey responses

Three survey responses

Three interviews

Applicant anonymisation

Reviewers or panel members or both do not see the identity of the applicant or applicants.

Main intended aims

The main aims are to reduce bias and to foster innovative or transformative ideas.

Main hazards

The main hazard is a limited ability to judge feasibility of projects.

Evidence strength

Four star

Findings

This intervention aims to reduce bias (for example, in relation to institution, gender or career stage), and to focus reviewers’ attention on the project idea rather than the person, in order to identify and fund more unconventional research.

This intervention is widely used. Among the examples we find are:

  • the NIH Director’s Transformative Research Award
  • EPSRC’s New Horizons scheme
  • ESRC’s Transformative Research Scheme (currently paused)
  • the New Zealand HRC Explorer Grant

It is also in use at the Austrian Science Fund (FWF), VW Foundation (where it is variously paired with other interventions), and it has been piloted at the Swiss SNSF.

SNSF’s Spark Fund evaluation found that anonymising applications attracts more unconventional research ideas. Evidence from the VW Foundation also shows increased success rates for women applicants and early career researchers. Similarly, anonymisation in the FWF’s 1000 Ideas programme attracted a more diverse pool of applicants than other programmes. Anonymisation is generally considered a ‘gold standard’ for reducing bias.

There is, however, some evidence to suggest that in the absence of personal information with which to judge the suitability of applicants, reviewers or panellists sometimes report that they struggle to assess the feasibility of projects (though ‘feasibility’ as an assessment criterion usually covers aspects besides applicants’ abilities). FWF also reported instances of jury members saying they knew who the applicant was and why they should be funded; in such cases, FWF reminds the jury that this information is irrelevant to the assessment process.

There is also evidence to suggest that not all bias is eliminated through applicant anonymisation. One reviewed study finds that men and women tend to use language differently, and that reviewers reward some language use more associated with men. FWF reported that some applications accidentally included information about affiliation, for example by referring to the ethics policy of a particular university.

Our research also finds that anonymisation is often coupled with other interventions (including review without a panel and partial randomisation), and that funders suspect it may be the combination, rather than the intervention by itself, that leads to positive outcomes. A separate, non-anonymised application stage can help mitigate the issue around judging feasibility (as practised in ESRC’s Transformative Research scheme).

References

The experimental research funder’s handbook, Research on Research Institute. Bendiscioli S, Firpo T, Bravo-Biosca A, Czibor E, Garfinkel M, Stafford T, Wilsdon J, Buckley Woods H and Balling GV.

Looking across and looking beyond the knowledge frontier: intellectual distance, novelty, and resource allocation in science. Boudreau K, Guinan E, Lakhani K and Riedl C.

What do we know about grant peer review in the health sciences? Guthrie S, Ghiga I and Wooding S.

Is blinded review enough? How gendered outcomes arise even under anonymous evaluation. Kolev J, Murray Y and Fuentes-Medel F.

Conservatism gets funded? A field experiment on the role of negative information in novel project evaluation. Lanei JN, Teplitskiy M, Gray G, Ranu H, Menietti M, Guinan EC and Lakhani KR.

Evaluation of the Spark pilot. Langfeldt L, Ingeborgrud L, Reymert I, Svartefoss SM and Borlaug SB.

Anonymizing peer review for the NIH Director’s Transformative Research Award Applications. Lauer M.

German funder sees early success in grant-by-lottery trial. Matthews D.

An experimental test of the effects of redacting grant applicant identifiers on peer review outcomes. Nakamura R, Mann LS, Lindner MD, Braithwaite J, Chen MC, Vancea A, Byrnes N, Durrant V and Reed B.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

Blinding applicants in a first-stage peer-review process of biomedical grants: an observational study. Solans-Domenech M, Guillamón I, Ribera A, Ferreira-González I, Carrion C, Permanyer-Miralda G and Pons JMV.

One additional confidential UKRI document shared for information with the study.

Interviewees and survey responses

One survey response

Four interviews

Automation-assisted reviewer allocation

Using algorithms, AI or text recognition to aid allocation of reviewers to applications.

Main intended aims

The main aims are:

  • to increase efficiency or reduce burden in reviewer allocation
  • to better match applications to reviewers

Main hazards

The main hazards are that:

  • the technology is not widely tested
  • some algorithms may have flaws (for example, difficulty integrating first-time reviewers)

Evidence strength

Three star

Findings

This is an intervention that has become possible with some modern application management systems. It typically involves matching applications’ keywords or other machine-readable details to reviewers who are associated with those keywords (for example, via applications they have reviewed in the past).
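As a concrete illustration of the kind of matching involved, the sketch below ranks reviewers by keyword overlap with an application. The names and the similarity measure (Jaccard overlap) are our own assumptions for illustration, not any funder’s actual algorithm.

```python
# Illustrative sketch only: a minimal keyword-overlap matcher of the kind
# described above. Jaccard similarity is an assumed scoring choice.

def jaccard(a: set, b: set) -> float:
    """Overlap between two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest_reviewers(application_keywords, reviewer_profiles, top_n=3):
    """Rank reviewers by keyword overlap with the application.

    reviewer_profiles maps a reviewer name to keywords drawn, for example,
    from applications they have reviewed in the past.
    """
    ranked = sorted(
        reviewer_profiles.items(),
        key=lambda item: jaccard(application_keywords, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]

# Example: staff would still check these suggestions ('automation-assisted').
profiles = {
    "R1": {"genomics", "bioinformatics", "cancer"},
    "R2": {"materials", "photonics"},
    "R3": {"genomics", "machine learning"},
}
print(suggest_reviewers({"cancer", "genomics"}, profiles, top_n=2))  # ['R1', 'R3']
```

A production system would of course use richer signals than raw keywords (text similarity, past review history, conflict-of-interest checks), but the ranking step follows the same pattern.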

We use the term ‘automation-assisted’ rather than just ‘automated’ to denote that a human element remains in the process at all times: whatever the automated system recommends still needs to be checked by funder staff.

The objective of this approach is to increase efficiency in expert allocation, reduce administrative burden and enable higher-quality reviews by identifying the most knowledgeable experts on the topics concerned. It may also reduce declined review invitations (reviewers declining because the subject matter is outside their expertise). The technology can also be used to identify potential conflicts of interest.

Automation-assisted reviewer allocation is in ongoing use at the Australian Research Council (ARC) with reported satisfaction. We find mentions of previous use at the Canadian Institutes of Health Research but with poorer reception, although a review study suggests this is due to avoidable challenges with implementation.

The Research Council of Norway (RCN) uses an online tool to find experts to assess applications. RCN reports significant time savings and access to a broader pool of reviewers. In addition, we find anecdotes of numerous instances of use in journal peer review with a high level of reviewer satisfaction.

In short, this is a very promising intervention. We find no difficulties at a general level. However, it may be subject to pitfalls simply because it is a relatively new technological approach. For example, if reviewers are identified based on past reviews, then there is a potential challenge around how to integrate new first-time reviewers into the system. It is not an insurmountable challenge, but one that requires consideration.

The approach has been studied (and algorithms have been developed), but implementation so far is somewhat limited. However, we note that we found no high-profile announcements of implementation even where it had occurred, so it is possible that it is used more than it appears. Potential hazards can likely be avoided through sharing of successful algorithms and technical procedures by funders who have had positive results.

References

Assigning evaluators to research grant applications: the case of Slovak Research and Development Agency. Cechlárová K, Fleiner T and Potpinková E.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

An algorithm for automatic assignment of reviewers to papers. Kalmukov Y.

How research funders ensure the scientific legitimacy of their decisions: investigation in support of the design of formas. Kolarz P, Arnold E, Davé A, Andreasson H and Bryan B.

UKRI Research and Innovation Funding Service (RIFS) visioning work. Kolarz P.

A semi-automatic web based tool for the selection of research projects reviewers. Pupella V, Monteverde ME, Lombardo C, Belardelli F and Giacomini M.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

One additional confidential UKRI document shared for information with the study.

Interviewees and survey responses

No survey responses

One interview

Dragon’s den-style pitch

Applicants are invited to pitch their proposal in front of a panel, and panels have an opportunity to ask questions. This differs from an interview in that no other form of evidence (for example, written proposals or external expert review) is used in the assessment.

Main intended aims

The main aims are to:

  • increase stakeholder involvement
  • fund novel, transformative ideas

Main hazards

The main hazards are that it:

  • favours applicants with strong presentation skills
  • may present access problems

Evidence strength

One star

Findings

This intervention seeks to provide an innovative way of allocating funding by facilitating stakeholder engagement with the research ideas. Fostering more diverse and transformative research projects has also been noted as an aim, though ‘transformative’ may here suggest societal transformation rather than transformation of scientific practice itself.

EPSRC has used Dragon’s Den-style events in the Bright IDEAS Award programme. There is no evaluation of the programme, but it claims to have funded a highly diverse set of applicants and potentially transformative research.

It has also been used by the Hounslow and Richmond Community Healthcare NHS Trust to ensure that some of the most innovative practices are captured and supported. Two pitching panels were carried out with positive effects in terms of mentoring, fostering collaborative work and innovation in the trust. In another case at the National Cancer Research Institute, a Dragon’s Den event was used to facilitate patients’ involvement in epidemiological research. This resulted in positive feedback from participants in terms of their interest in continuing to engage with the research.

Authors emphasise the role of independent facilitators to run the process. In other words, there needs to be sufficient briefing and oversight of the ‘dragons’. More generally, there is a perceived difficulty in that these events will only suit specific types of individuals (good presentation skills, able to access the events, native speakers) and disadvantage others. This is therefore unlikely to be a widely suitable intervention.

References

Dragons’ Den: promoting healthcare research and innovation. Mazhindu D and Gregory S.

Fleshing out the data: when epidemiological researchers engage with patients and carers. Learning lessons from a patient involvement activity. Morris M, Alencar Y, Rachet B, Stephens R and Coleman MP.

Interviewees and survey responses

No survey responses

One interview

External review only (no panel)

Proposals are only assessed by external reviewers and review scores are simply combined to give the final score.

Main intended aims

The main aims are to:

  • reduce risk-averseness of panels
  • reduce burden and costs
  • better match applications to expertise

Main hazards

The main hazards are:

  • reduced layers of risk control
  • potential lack of transparency

Evidence strength

Two star

Findings

This approach is intended to reduce risk-averseness in panel discussions and to reduce burden (in this case for panellists rather than reviewers in general). Further, it gives more flexibility in matching reviewers with applications, as the choice is not limited to a relatively small number of panellists, allowing a better match between reviewers’ expertise and applications. This intervention may also cut the costs of in-person panels. We find examples of its use at Australia’s NHMRC, Switzerland’s SNSF and at NERC in the UK.
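The core mechanism can be sketched minimally as follows, assuming final scores are a simple mean of external reviewer grades (aggregation rules vary by funder, and the proposal identifiers are invented):

```python
# Illustrative sketch, not any specific funder's procedure: with no panel,
# each proposal's final score is simply an aggregate of its external reviews.
from statistics import mean

def rank_proposals(reviews: dict) -> list:
    """reviews maps proposal id -> list of external reviewer scores.

    Returns proposal ids ordered by mean score, best first.
    """
    return sorted(reviews, key=lambda pid: mean(reviews[pid]), reverse=True)

scores = {"P1": [4, 5, 4], "P2": [5, 5, 4], "P3": [2, 3, 3]}
print(rank_proposals(scores))  # ['P2', 'P1', 'P3']
```

In practice a funder would also need tie-breaking and moderation rules; the point is only that the final ordering comes directly from reviewer grades rather than panel deliberation.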

For the NERC example we find no evaluation evidence, which is why this example is not discussed further. We do note that while there is no panel, reviews are moderated by an external moderator who is an expert in the field and who makes a funding recommendation to NERC.

For SNSF’s Sinergia, the evaluation found that original and unconventional research was given better chances by including originality and unconventionality as key review criteria and funding proposals based on aggregated reviewer grades (rather than panel discussions). Omitting panel meetings was also a way of reducing review costs for small grants.

NHMRC’s data (based on reviewers’ declared suitability before peer review) and responses to NHMRC’s panel member survey suggested better matching of reviewers to applications in 2020 than in 2019.

For the NHMRC case, there was a perceived lack of transparency in the initial round (2020); however, this was mitigated by the addition of reviewer comments in 2021, when previously only scores had been released. This intervention also appears largely limited to use for small grants. For larger ones, there is a perceived danger due to fewer ‘layers’ of risk control.

References

Peer review for Ideas Grants in 2021. Kelso A.

Interviewees and survey responses

No survey responses

One interview

Group review

The same reviewer comments on multiple proposals.

Main intended aims

The main aims are to:

  • facilitate consensus building
  • increase diversity of reviews

Main hazards

The main hazard is group bias.

Evidence strength

One star

Findings

This intervention aims to facilitate consensus and deliver more comprehensive reviews particularly when reviewing manuscripts for academic journals. We find very limited evidence on this intervention.

The Association of American Medical Colleges experimented with this intervention and found that more thorough feedback was provided to researchers. Reviewers changed their initial individual assessments during the group review process, and less time was required to evaluate the papers than reviewers would have spent individually.

The sources we find note a risk of group bias and that shared views may consolidate over time.

References

Expanding group peer review: a proposal for medical education scholarship. Dumenco L, Engle D, Goodell K, Nagler A, Ovitsh R and Whicker S.

Communities of practice in peer review: outlining a group review process. Nagler A, Ovitsh R, Dumenco L, Whicker S, Engle D and Goodell K.

Interviewees and survey responses

No survey responses or interviews

Changing the number of reviewers

Two to three external reviews per application are typical for responsive-mode grant funding, but this number may be lowered to one or significantly increased.

Main intended aims

The main aims are to:

  • increase numbers, to improve robustness or reliability
  • decrease numbers, to save time, burden or cost

Main hazards

The main hazards are:

  • increasing numbers:
    • a single bad review can sink an application
    • labour intensive
  • decreasing numbers:
    • reduced robustness
    • potential for greater bias

Evidence strength

Three star

Findings

Increasing the number of reviewers is done to improve quality and reliability, to mitigate random variation in individual reviews, and to improve the ability to address additional assessment criteria. Conversely, reducing the number of reviewers can be done with the aim of reducing the cost and burden of reviews.

There is a broad consensus that reliability of decisions increases with the number of reviewers. This has been demonstrated in quantitative studies and confirmed by funder experience. Several studies have found that five reviewers is an optimal upper limit for robustness, but this is based on data from specific types of programmes. For very small grants, a single reviewer is sometimes used (for example, at the German Research Foundation).

In short, setting the number of reviewers balances two objectives:

  • adding more reviewers to optimise reliability
  • reducing the number of reviewers to improve resource efficiency

The optimal number will inevitably depend on situation-specific trade-offs between cost and benefit of adding more reviewers and the tolerance for mistakes in specific situations.
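The relationship between reviewer numbers and reliability can be illustrated with a toy simulation (our own sketch, not drawn from the studies cited here): each reviewer observes a proposal’s ‘true’ quality with independent noise, and the error of the mean score shrinks, with diminishing returns, as reviewers are added.

```python
# Toy model (assumed parameters, purely illustrative): reviewer scores are
# the true quality plus independent Gaussian noise. The average error of the
# mean score falls roughly as 1/sqrt(n) as reviewers are added.
import random
import statistics

def mean_score_error(n_reviewers, true_quality=5.0, noise_sd=1.5,
                     trials=2000, seed=42):
    """Average absolute gap between the mean score and the true quality."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        scores = [true_quality + rng.gauss(0, noise_sd)
                  for _ in range(n_reviewers)]
        errors.append(abs(statistics.mean(scores) - true_quality))
    return statistics.mean(errors)

# error roughly halves between one and four or five reviewers, then flattens
for n in (1, 2, 3, 5, 10):
    print(n, round(mean_score_error(n), 3))
```

This diminishing-returns pattern is consistent with the finding above that around five reviewers is often cited as an optimal upper limit, though the right number in practice depends on the cost of each additional review.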

There is some disagreement about the appropriateness of using a single reviewer. This is sometimes done for very small grants, but some argue that the minimum should be two reviewers.

Further, inter-rater reliability (IRR) is the subject of a large volume of technical literature across this and related topics, which is beyond the scope of this discussion.

References

The experimental research funder’s handbook. Bendiscioli S, Firpo T, Bravo-Biosca A, Czibor E, Garfinkel M, Stafford T and Wilsdon J.

The decision-making constraints and processes of grant peer review, and their effects on the review outcome. Langfeldt L.

Consultation on the development of peer review for NHMRC’s new grant program. Nous Group.

Interviewees and survey responses

No survey responses

One interview

Interviews

Lead applicant (or several application team members) may do a presentation (optional) and are then asked questions on their application by panel members, reviewers or funder representatives.

Main intended aims

The main aims are to:

  • improve quality of reviews and increase scrutiny
  • give an opportunity to respond to criticism

Main hazards

The main hazards are that:

  • it is resource intensive
  • it may introduce bias, disadvantage or both for certain groups

Evidence strength

Two star

Findings

Interviews serve different purposes depending on the scheme. They can serve to demonstrate the applicant’s presentation skills (which may be especially relevant to commercialisation projects), improve engagement with panellists (assuming panellists are the interviewers), allow applicants to respond to comments and defend their proposal, or improve the overall quality of reviews as the interview will provide reviewers with additional context.

When used, interviews often occur at the end of the process: due to their resource-intensive nature, efficiencies can be gained by having interviews as the final stage of a multi-stage assessment process (by which point most applicants will have already been rejected, meaning there are fewer interviews to do). In addition to standard interviews, they can also take the form of a scientific symposium or workshop. The practice is often used for early career fellowships (including strongly person-centred awards) and schemes aiming to fund particularly transformative research.

Funders have found interviews to be a helpful way of assessing proposals or candidates against specific objectives, whereas others use them more widely to improve the quality of the review. It can be difficult to evidence the exact effect of using interviews, but one study found that the interview stage had a significant impact on the final grant selection.

Interviews are typically used in addition to other types of assessment.

They are particularly resource intensive, requiring time and space set aside for each individual applicant. For this reason, Wellcome has for instance decreased its use of interviews in recent years. Assessment through interviews can also be biased against certain personality types (for example, introverted, nervous, non-native speakers). In-person interviews may also pose difficulties for applicants with caring responsibilities or disabilities. However, we note on a final point that there is limited research on the effectiveness of interviews in terms of achieving certain types of funding outcomes, despite their relatively frequent use.

References

Academic talent selection in grant review panels. van Arensbergen P, van der Weijden I and van der Besselaar P.

An outcome evaluation of the National Institutes of Health (NIH) Director’s pioneer award (NDPA) program. Lal B, Wilson A, Jonas S, Lee E, Richards A and Peña V.

Evaluation practices in the selection of ground-breaking research proposals. Luukkonen T, Stampfer M and Strassnig M.

Interviewees and survey responses

No survey responses

Three interviews

Moderation of reviews

Reviews are processed internally by funding organisation staff and are only passed to the external panel if they are of sufficient quality.

Main intended aims

The main aim is to ensure consistency or quality of reviews.

Main hazards

The main hazards are that:

  • it can be time consuming for administrators
  • administrators may not have sufficient thematic expertise

Evidence strength

One star

Findings

Moderation of reviews is intended to ensure quality of the reviews received in order not to waste time or have an inconsistent evidence base at later stages of the evaluation process. This particularly applies during panel reviews and for feedback from assessor or reviewers. Moderation might only involve a basic ‘usability check’ (ensuring that reviews are not just one line of text or similar) or more involved engagement to check if the reviews meet a broader set of criteria.
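A basic ‘usability check’ of the kind described might look like the following sketch; the word-count minimum and score scale are invented thresholds for illustration, not any funder’s actual rules.

```python
# Hypothetical 'usability check' for incoming reviews. The 30-word minimum
# and the 1-6 score scale are assumptions made for this illustration only.
def review_is_usable(review_text: str, score, min_words=30):
    """Pass a review to the panel only if it is substantive and scored."""
    has_enough_text = len(review_text.split()) >= min_words
    has_valid_score = score is not None and 1 <= score <= 6
    return has_enough_text and has_valid_score

# a one-line review fails the check regardless of its score
print(review_is_usable("Good proposal.", 5))  # False
```

More involved moderation (checking that each assessment criterion is addressed, or that scores and comments are consistent) would extend the same gate with further conditions.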

All UKRI councils use some degree of review moderation. For example, Innovate UK introduced a moderation phase to review outlier scores from assessors to ensure consistency, since they were receiving complaints from applicants about conflicting feedback from assessors.

Our research received some anecdotal comments noting, on the one hand, that moderation of reviews brings benefits in terms of consistent review quality but, on the other, that it places a burden on administrators’ time. Additionally, administrators may not always have all the necessary thematic expertise if moderation extends to thematic aspects.

Beyond such anecdotal points, our research found no further evidence on the efficacy or hazards of this intervention. The literature is sparse and does not distinguish this intervention from the moderation panel intervention.

References

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

Ensuring sustainable evaluation: how to improve quality of evaluating grant proposals? Wieczorkowska G and Kowalczyk K.

Interviewees and survey responses

No survey responses or interviews.

Moderation panel

Assessment panels use external reviews alongside their own expertise to assess the proposal. Moderation panels do not use their own expertise but can only use the reviews to inform their scores.

Main intended aims

The main aims are to ensure consistency, increase expertise and robustness of reviews.

Main hazards

Not known

Evidence strength

One star

Findings

Assessment panels where members can bring in their own expertise are the baseline approach funders use. Our research found no evidence of the effectiveness of using moderation panels. UKRI uses moderation panels in some programmes where assessment panels cannot cover the breadth of expertise required to assess applications from diverse disciplines. However, the effectiveness of the moderation panels is not systematically studied, therefore, evidence strength for this intervention is weak.

By contrast with moderation panels, assessment panel members can bring in their own expertise; this approach is mostly part of the baseline process and is therefore not considered an intervention in this study.

References

None

Interviewees and survey responses

No survey responses or interviews

Panel only (no postal or external review)

Proposals are only assessed by a panel of experts.

Main intended aims

The main aims are to:

  • increase speed of decisions, efficiency, ensure consistency of reviews
  • include strategic perspectives in reviewing

Main hazards

The main hazards are:

  • the difficulty to cover the required expertise in a panel
  • that it may still need additional reviews
  • potential bias

Evidence strength

Three star

Findings

This intervention is similar to the ‘group review’ intervention, though it involves reviewers actually meeting as a group (review panel), which the ‘group review’ intervention does not. It is used for a variety of reasons:

  • to speed up funding decisions
  • to reduce written feedback (and its associated costs and burden)
  • to improve quality and consistency of feedback to applicants
  • to assess riskier research proposals and where strategic considerations play a central role in the judgement process (for example, ensuring EDI is properly assessed)

AHRC adopted panel-only assessment for the Equality, Diversity and Inclusion Engagement Fellowship (EDIEF) pilot programme, forming a bespoke panel that embedded EDI in the evaluation process and sped up funding decision making.

The Royal Society has also used it with positive results, funding more high-risk high-reward research proposals, and an increase in the number of individuals willing to participate as panellists. The Royal Society has also found that rigour has remained high, which is also reflected in our survey responses.

Cancer Research UK has implemented panel-only assessment, resulting in a significant reduction of written peer review requests. It highlighted the benefits of having in-person discussions as a more valuable way of evaluating research applications.

Panel-only review was also a technique used by a range of funders in their R&I funding responses to COVID-19, as a mechanism for ensuring that awards were made quickly and could thus respond to the societal emergency at hand. Several examples are detailed in the process review of UKRI’s response to COVID-19 (which contains a review of six international comparators), though long-term evaluations of effectiveness are not yet available.

It is a challenge to represent enough expertise on a panel to cover the potentially broad thematic and subject range of a large number of incoming applications. There is therefore typically a need for large panels, and funders may still have to rely on external reviewers when applications fall outside the panel’s expertise, or in the absence of agreement.

Configuration of panels may be difficult as panellists may need to be recruited from distant subject domains, potentially creating some administrative burden in panel set-up.

There is broad agreement on the effectiveness of this intervention (reducing burden and speeding up decision making) but controversy around its effects regarding bias. Some evidence showed panels purposefully used to embed and ensure EDI throughout the process, while in other programmes it was found that it failed to sufficiently factor this in. Cross-disciplinary panel composition may also result in ‘communication problems’.

We note that this is the most frequently discussed intervention in our UKRI staff survey.

References

Similar-to-me effects in the grant application process: Applicants, panelists, and the likelihood of obtaining funds. Banal-Estañol A, Liu Q, Macho-Stadler I and Pérez-Castrillo D.

Equality, Diversity and Inclusion Engagement Fellowship Pilot AHRC Funding Scheme Report. Blackburn M, Coutinho K and Suviste H.

Designing grant-review panels for better funding decisions: lessons from an empirically calibrated simulation model. Feliciani T, Morreau M, Luo J, Lucas P and Shankar K.

Process review of UKRI’s research and innovation response to COVID-19. Kolarz P, Arnold E, Bryan B, D’hont J, Horvath A, Simmonds P, Varnai P and Vingre A.

Exploring the degree of delegated authority for the peer review of societal impact. Samuel GN and Derrick G.

Interviewees and survey responses

Six survey responses

Two interviews

Peer allocation

The applicants are also the assessors and review the proposals they are competing against to decide who gets funding.

Main intended aims

The main aims are to:

  • lessen administrative burden
  • reduce pressure to identify reviewers

Main hazards

The main hazards are that:

  • it is possibly open to abuse or gaming
  • it adds to applicant burden

Evidence strength

Three star

Findings

Because applicants are themselves the reviewers, this intervention matches the supply of reviewers to the volume of applications; it is intended to lessen administrative burden and shorten the overall time taken to recruit reviewers.

There are a small number of known instances where the intervention is in use, but the results are cautiously positive across the board. In one scheme, seven rounds of review were organised within a year, with a total of 614 reviews carried out by 201 reviewers (some being applicants and some not). When compared, the scores of the two groups correlated. Where successfully in use, it relieves the pressure to identify expert reviewers and appears to be a successful way to expedite the review process without compromising the integrity of the selection.

The NSF ran an experiment on peer allocation in 2013. As a condition of application, applicants had to commit to assessing seven other proposals submitted to the scheme and then rank the proposals from best to worst. The NSF also employed a mechanism to dissuade reviewers from downgrading a competitor’s proposal in order to boost their own. Reviewers earned bonus points on their own applications if their assessments of other proposals closely matched what their colleagues thought. An article in Science reports that the system saved time and money, but that the need for ‘group consensus’ may disadvantage novel, unconventional ideas.
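The agreement-bonus idea can be sketched as follows. The source does not specify NSF’s actual formula, so the scoring here (a bonus that shrinks with rank displacement from the consensus) is purely illustrative.

```python
# Illustrative sketch of an agreement bonus; the formula is an assumption,
# not NSF's actual mechanism. A reviewer earns a larger bonus the closer
# their ranking of competing proposals is to the consensus ranking.
def agreement_bonus(reviewer_ranking, consensus_ranking, max_bonus=10.0):
    """Rankings are lists of proposal ids, best first. The bonus shrinks
    linearly with total rank displacement (Spearman footrule distance)."""
    consensus_pos = {pid: i for i, pid in enumerate(consensus_ranking)}
    displacement = sum(abs(i - consensus_pos[pid])
                       for i, pid in enumerate(reviewer_ranking))
    n = len(reviewer_ranking)
    worst = n * n // 2  # maximum possible footrule distance
    if worst == 0:      # degenerate case: a single proposal
        return max_bonus
    return max_bonus * (1 - displacement / worst)

# Perfect agreement earns the full bonus; a reversed ranking earns none.
print(agreement_bonus(["A", "B", "C", "D"], ["A", "B", "C", "D"]))  # 10.0
print(agreement_bonus(["D", "C", "B", "A"], ["A", "B", "C", "D"]))  # 0.0
```

Any such mechanism must balance rewarding honest convergence against the ‘group consensus’ risk the Science article notes, since it pays reviewers for agreeing with the majority.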

Peer allocation may risk being abused if:

  • the consistency of scoring with non-applicant reviewers is not monitored
  • the approach is mainstreamed

It is possible that this is mainly viable in smaller, perhaps early-career settings. Moreover, peer allocation has the reverse effect on the administrative burden on applicants, particularly if additional training is required.

As a side note, the ESRC Transformative Research scheme had an element of this but in the context of a dragons’ den-style event rather than application review proper (termed ‘pitch-to-peers’). The evaluation found that, contrary to simple self-interest arguments, reviewers were generally supportive of their fellow applicants, resulting in collegial discussions.

References

Community review: a robust and scalable selection system for resource allocation within open science and innovation communities. Graham C L B, Landrain T E, Vjestica A, Masselot C, Lawton E, Blondel L, Haenal L, Greshake Tzovaras B and Santolini M.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

Evaluation of the ESRC Transformative Research Scheme. Kolarz P, Arnold E, Farla K, Gardham S, Nielsen K, Rosemberg C and Wain M.

A radical change in peer review: a pilot project to ease pressure on NSF’s vaunted peer-review system required grant applicants to review seven competing proposals. Mervis J.

Powering discovery: the expert panel on international practices for funding natural sciences and engineering research. The Council of Canadian Academies.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

Interviewees and survey responses

No survey responses or interviews

Programme manager’s discretion

Applications go directly to the programme or scheme manager, who can recommend funding or even decide to fund unilaterally. Usually involves complete bypass of peer and panel review.

Main intended aims

The main aims are to:

  • shorten time-to-grant
  • reduce overall burden
  • respond to emergencies
  • fund high-risk high-reward projects likely to fail in peer review

Main hazards

The main hazards are that:

  • there is evidence that it may be underused as programme managers themselves can be risk averse
  • it lacks transparency, potentially a ‘winners’ game’

Evidence strength

Three star

Findings

This approach is used to support exploratory or high-risk or high-reward projects that might not be selected through potentially more conservative peer review.

This approach has also been used to respond quickly to urgent issues or to grasp immediate opportunities for innovative developments. For example, several funders have on occasion partly or fully bypassed peer review. These include:

  • NSF
  • NWO
  • NRC
  • French National Research Agency

Several of them relied on programme managers in parts of their COVID-19 response funding and found that this accelerated the funding decisions at a time when research projects had to start as soon as possible.

The approach can also be applied by leaving the final decision to funding staff (including programme managers) after an initial shortlisting or sift through a more traditional external review process. Furthermore, even when programme managers are tasked with assessing the applications, there is usually still an option to recruit external expertise if necessary.

This approach has been found to be successful in supporting exploratory research that often led to follow-on funding and significant results further down the line. The approach also encourages dialogue between applicants and staff at the funding organisation.

This approach is particularly common in the US and is associated with the Defense Advanced Research Projects Agency (DARPA) approach, which is widely considered highly successful and attracts a lot of attention from businesses. NSF used the approach in the Small Grants for Exploratory Research programme introduced in 1990; it is now applied in the successor programme, the RAPID instrument, which is used to fund research in response to emergencies.

In one case, actual use of discretionary allocation was found to be much lower than the allowed limit (up to 5% of grant budget).

Against widespread use of this mechanism, it is argued that the selection process lacks transparency, effectively basing decisions on one person’s opinion. It has also been argued that successful application of the ‘DARPA’ model is a ‘winners’ game’, potentially benefitting the most well-established and well-connected researchers.

References

Administrative discretion in scientific funding: evidence from a prestigious postdoctoral training program. Ginther DK and Heggeness ML.

Alternatives to peer review in research project funding. Guthrie S, Guerin B, Wu H, Ismail S and Wooding S.

Process review of UKRI’s research and innovation response to COVID-19. Kolarz P, Arnold E, Bryan B, D’hont J, Horvath A, Simmonds P, Varnai P, Vingre A.

Evaluating transformative research programmes: A case study of the NSF Small Grants for Exploratory Research programme. Wagner CS and Alexander J.

Interviewees and survey responses

No survey responses

Two interviews

Standing panels versus portfolio panels

Standing panels remain the same year on year (with some replacement due to retirement from the panel). Portfolio panels are assembled based on the proposals received and will therefore be composed differently in each round of funding.

Main intended aims

Standing panels ensure consistency, and may be the site of long-term learning and interdisciplinary conversation

Main hazards

Standing panels may potentially lead to institutionalised bias

Evidence strength

Three star

Findings

Broadly speaking, standing panels ensure greater consistency over time and the creation of certain ‘cultures’ and understandings of specific scheme aims. At the same time, portfolio (or ‘ad hoc’) panels can be assembled to better reflect the thematic and disciplinary spread of a specific pool of applications.

The literature highlights that standing panels ensure consistent evaluation. There appears also to be a link with more consistent and comprehensive feedback on applications (particularly important for resubmitted proposals). It also creates opportunities for interdisciplinary conversations between panellists, reviewers and applicants, including over time as a standing panel ‘matures’.

In some cases, standing panels present an opportunity to develop the review capacity of reviewers or staff, and to support the professional development of applicants. They also reduce the recruitment burden on programme officers, as members of standing panels are normally appointed for several-year periods. For example, the review of the National Institute on Disability and Rehabilitation Research (NIDRR) funding processes concluded that programme staff managing programmes with standing panels face less burden in peer recruitment.

While various forms of training (for example, EDI training) have a longer-lasting effect on standing panels, in the absence of such training there may also be more institutionalised bias and narrower perspectives. This needs to be considered when configuring standing panels, to offset these potential drawbacks.

The main advantages of portfolio panels are a fresh perspective and a better ability of peers to assess the specifics of the funding programme concerned, as the peers are selected specifically for the funding opportunity or programme. However, we find no empirical evidence assessing the functioning of portfolio panels; in the literature, the associated benefits are reported as assumptions about how portfolio panels would work.

It is worth noting that hybrid versions are possible, and are practised to some degree by many funders. For example, the Human Frontier Science Program (HFSP) uses standing panels where each panellist has a finite tenure. Once a panellist’s tenure expires, secretariat staff may consider any changes over time to the portfolio of applications (evolving themes and new emerging methods or interdisciplinary perspectives) when identifying new panellists.

References

Enhancing NIH grant peer review: a broader perspective. Bonetta L.

Standing Panel on Impact Assessment (SPIA). CGIAR IAES.

Participation and motivations of grant peer reviewers: a comprehensive survey of the biomedical research community. Gallo SA, Thompson LA, Schmaling KB and Glisson SR.

IES guide for grant peer review panel members: information, tips and instructions for members of IES scientific grant peer review panels (PDF, 780KB). Institute for Education Sciences.

Organisational and process review of the Human Frontier Science Program. Kolarz P.

Process review of UKRI’s research and innovation response to COVID-19. Kolarz P.

Becoming a peer reviewer for NIJ (2023). National Institute of Justice.

Improving NIJ’s peer review process: the scientific review panel pilot project. Newton P and Feucht T.

CSR data and evaluations. National Institutes of Health.

Review of disability and rehabilitation research: NIDRR grantmaking processes and products. Rivard J, O’Connell M and Wegman D.

Responsive mode grant assessment process. UKRI.

An overview: IES procedures for peer review of grant applications. University of Florida College of Education.

Managing internal nomination and peer review processes to reduce bias. Wigginton N, Johnston J and Chavous T.

Interviewees and survey responses

One survey response

Use of international assessors

Having quotas for assessors based in countries other than the funder’s ‘home’ country. May extend to mandating all-international panels, reviewers or both.

Main intended aims

The main aims are to:

  • avoid conflicts of interest, ensure required expertise and fill gaps
  • bring in specific country expertise

Main hazards

The main hazard is that it may require more guidance or training for panellists.

Evidence strength

Two star

Findings

International assessors (reviewers or panellists) are used to ensure required expertise (in-country knowledge, international development knowledge) to fill competence gaps in the funder country. Particularly in smaller countries, there may be minimum quotas for international reviewers, or even mandates to use only international reviewers to avoid conflicts of interest among reviewers. For example, the Austrian Science Fund FWF uses only international reviewers for this reason.

UKRI, Wellcome, the Royal Society and CRUK are among many funders that have used international reviewers extensively. They have found this effective in diversifying and expanding the pool of reviewers (particularly from developing countries) and in ensuring review quality.

Funders have also used international assessors to fill gaps and to meet knowledge or context-specific needs. Several funders are keen to use this intervention more often because of its effectiveness and benefits. However, disparities across different countries’ typical assessment processes may require additional training or guidance for international reviewers and panellists.

While the likelihood of conflicts of interest among national reviewers is far greater in small countries, the issue also applies to larger countries to some extent. Some consultees for our study expressed significant support for international reviewers as a way to mitigate conflicts of interest in areas where there are few potential reviewers in the UK.

References

Assessment of potential bias in research grant peer review in Canada. Tamblyn R, Girard N, Qian CJ and Hanley J.

Interviewees and survey responses

Three survey responses

Four interviews

Use of metrics

Use of metrics and bibliometrics as part of the evidence base to inform decision making.

Main intended aims

The main aims are to:

  • provide additional information about applicants
  • increase robustness of review

Main hazards

The use of metrics is highly controversial. The main hazards are that metrics:

  • are a poor way of assessing research excellence and potential
  • may introduce biases (for example, around gender, career stage or research field)
  • are often used unethically

Evidence strength

Three star

Findings

Metrics can be used to support the assessment of funding applications, providing additional information about applicants’ track records. Where used, this is typically early in the assessment process. The most commonly used metrics are reportedly field-normalised citation measures and the proportion of publications among the most cited in the field. Some UKRI schemes have also used grant income or Research Excellence Framework (REF) outcome metrics in the past, prior to UKRI becoming a signatory to DORA (the San Francisco Declaration on Research Assessment). While the use of metrics overall is rare, where it does appear it tends to be in programmes funding biomedical research.

Recent survey evidence shows that bibliometric indicators are viewed as important by some reviewers, particularly in the early stages of the review to assess the candidate, and less important at the panel stage.

The use of metrics is controversial, and many limitations have been identified. First, various objections hold that metrics are a poor way of assessing research excellence and potential. Second, their use may lead to biases (for example, around gender, career stage or research field), and bibliometric indicators are often used unethically. Third, it cannot be ruled out that a focus on track record (as demonstrated by bibliometric analysis) contributes to a vicious circle in which those with shorter track records secure less funding, and therefore fewer research activities overall, while funding concentrates on established individuals. Survey evidence suggests that reviewers who themselves have good personal bibliometric impact scores are more likely to regard metrics as important.

In short, evidence suggests that some reviewers find bibliometric indicators useful as supporting information, but there are widespread concerns about their use in the research communities.

We note that, despite the general controversy around metrics, none of our survey respondents or interviewees commented on this intervention.

References

Harnessing the Metric Tide: indicators, infrastructures and priorities for UK responsible research assessment. Curry S, Gadd E and Wilsdon J.

How do NIHR peer review panels use bibliometric information to support their decisions? Gunashekar S, Wooding S and Guthrie S.

The role of metrics in peer assessments. Langfeldt L, Reymert I and Aksnes DW.

Segmenting academics: resource targeting of research grants. Viner N, Green R and Powel P.

Dealing with the limits of peer review with innovative approaches to allocating research funding (PDF, 314KB). Bendiscioli S and Garfinkel M.

The leaky pipeline in research grant peer review and funding decision: challenges and future directions. Sato S, Gygax P, Randall J and Schmid Mast M.

Interviewees and survey responses

No survey responses or interviews

Use of non-academic assessors (including industry, policy and practice, patients, etc.)

Having quotas for non-academic assessors. May extend to all-user panels, reviewers or both. May take the shape of consultation rather than directly making formal funding recommendations.

Main intended aims

The main aim is to increase societal relevance and impact.

Main hazards

The main hazard is that it may dilute notions of basic research, and it is not recommended for such contexts.

Evidence strength

Four star

Findings

The inclusion of non-academics is closely related to the increasing priority given to societal use and impact of research. Depending on the context, including non-academic reviewers may aim to:

  • represent stakeholder concerns (for example, patients)
  • improve the assessment of relevance and potential impact (for example, using industry reviewers)
  • improve the assessment of potential interest among users, as well as feasibility of real-world applications (for example, using technicians to support the assessment of applications for research infrastructure)

Our consultation reveals that most health research funders (for example Wellcome, NIH, NIHR, UKRI, MRC) involve patient representatives in at least some of their funding. Funders observe that this helps panel members assess whether the applications consider patient needs. The Science and Technology Facilities Council (STFC) sometimes involves technical professionals and project management experts in the assessment process of applications for new major projects and project technology development funding. STFC finds that this adds valuable information to assess the feasibility of proposed operational costs and other technical details. Relatedly, the involvement of non-academic assessors is seen as a way to overcome a perceived bias against applied research in traditional peer review.

Non-academic users are now widely used in programmes where societal or economic impact are important objectives and recommended in grey literature texts. Practically, this intervention can be implemented in a staggered review process, for example with a traditional academic peer review followed by a more diverse panel with a greater focus on relevance and impact.

Funders report that this helps improve panel discussions, understanding of the context of use (for example, industry), and the quality of the assessment. For example, EPSRC and Norway’s RCN use industry reviewers in programmes that support collaborative research and aim to deliver academic outcomes and also benefit industry partners. Using industry reviewers helps assess applications that cover industry motivation and potential commercial outcomes of the proposed projects.

Feedback from applicants shows that they feel better understood when industry reviewers are involved. Although it is hard to attribute programme success to this process element, impact and process evaluations show that the programmes have succeeded in selecting the right applications that align with programme objectives.

Despite the overall positive verdict, non-academic considerations are not appropriate in all contexts, for example, in the context of pure basic research funding schemes.

Our research also highlights a common view that there is a risk of bias in the selection of industry representatives from large enterprises in specialised roles, rather than from small and medium-sized enterprises. As such, pools of non-academic assessors may not be representative of the wider business population (this may not necessarily be a problem depending on scheme aims, for example, if it only targets certain types of businesses).

The value of non-academic reviewers may also be limited if the objectives and types of impact sought by the funding scheme are unspecific and too open-ended.

There is evidence that some academic reviewers believe they are already sufficiently aware of the wider context in which research is used to assess proposals themselves. Some also perceive the role of industry assessors negatively, seeing them as potentially blocking worthy applications due to a lack of understanding of the academic context.

Interviewed funders report some difficulties in finding and recruiting non-academic reviewers. Incentives to complete reviews are limited in academia and almost absent in industry and other sectors, which makes recruitment challenging. The systems funders use to find academic reviewers are not always appropriate for finding non-academic ones, so recruiting non-academic reviewers may require substantial additional effort.

There is, in short, some difficulty around this intervention, meaning it is important to consider carefully when to use it. However, as a means of increasing relevance and strengthening the interface between science and society, it is of significant importance.

References

Does the inclusion of non-academic reviewers make any difference for grant impact panels? Luo J, Ma L and Shankar K.

The influence of peer reviewer expertise on the evaluation of research funding applications. Gallo SA, Sullivan JH and Glisson SR.

Alternatives to peer review in research project funding. Guthrie S.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

The experimental Research funder’s handbook. Bendiscioli S, Firpo T, Bravo-biosca A, Czibor E, Garfinkel M, Stafford T and Wilsdon J.

UKRI Research and Innovation Funding Service (RIFS) visioning work (PDF, 1.5MB). Kolarz P, Bryan B, Farla K, Krčál A, Potau X and Simmonds P.

How research funders ensure the scientific legitimacy of their decisions: investigation in support of the design of Formas scientific management. Kolarz P, Arnold E, Davé A and Andréasson H.

Evaluation of research proposals by peer review panels: broader panels for broader assessments? Abma-Schouten R, Gijbels J, Reijmerink W and Meijer I.

Peer review of health research funding proposals: A systematic map and systematic review of innovations for effectiveness and efficiency. Shepherd J, Frampton GK, Pickett K and Wyatt JC.

Interviewees and survey responses

Three survey responses

Five interviews

Virtual panels

Convening panels online rather than in person.

Main intended aims

The main aims are to:

  • save costs and carbon footprint
  • ensure more international panellists
  • generally remove barriers to participation

Main hazards

The main hazard is potentially less robust or detailed discussion, though the evidence on this is unclear.

Evidence strength

Three star

Findings

Online panels saw drastically increased use during the COVID-19 pandemic to overcome travel restrictions and lockdowns. More broadly, online panels can help secure participation of international panel members. At a general level, online panels aim to reduce the costs and environmental impact of international (and even national) panellists travelling. Panellists with caring responsibilities or any other travel limitations are also usually more able to participate in virtual panels.

Online panels have been widely adopted by CRUK since the pandemic, and this has resulted in increased participation of international assessors. Other examples report cost reductions and greater diversity of panels. The NSF for example experimented with virtual panels in 2010, and an article in Science reports cost savings of $10,000 per panel.

There is a perceived need with virtual panel meetings to provide especially clear briefing beforehand. Some consultees see a risk of lower engagement, and therefore shorter discussions, compared with face-to-face panels, though a shorter, more focused discussion can also be seen as a positive in some cases.

Although this intervention has only recently seen widespread use, there is considerable positive feedback and agreement about its effectiveness. The substantial increase in use during the COVID-19 pandemic has also brought virtual panels to the attention of many more stakeholders.

Much of our literature does not cover these recent experiences and it is possible that this intervention has more detractors than the pre-pandemic literature suggests. We note as one example the Irish Research Council’s Laureate Award scheme, which shifted its panel meetings online during the pandemic and was reviewed shortly thereafter. The report surveyed panellists on the experience and whether online panels should be mainstreamed in future. While the mean opinion is reported to be in the range of ‘neutral’ to ‘somewhat in favour’, the review notes a broad range of different sentiments, indicating the need for more research and consultation on the matter.

References

Face-to-face panel meetings versus remote evaluation of fellowship applications: simulation study at the Swiss National Science Foundation. Bieri M, Roser K, Heyard R and Egger M.

Meeting for peer review at a resort that’s virtually free. Bohannon J.

Teleconference versus face-to-face scientific peer review of grant application: effects on review outcomes. Gallo SA, Carpenter AS and Glisson SR.

Alternatives to peer review in research project funding. Guthrie S, Guerin B, Wu H, Ismail S and Wooding S.

What do we know about grant peer review in the health sciences? Guthrie S, Ghiga I and Wooding S.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

Pre-award process review of the IRC Laureate Award (PDF, 2.4MB). Kolarz P, Arnold E, Cimatti R, Dobson C and Seth V.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

Peer review of health research funding proposals: A systematic map and systematic review of innovations for effectiveness and efficiency. Shepherd J, Frampton GK, Pickett K and Wyatt JC.

Interviewees and survey responses

One survey response

Three interviews

Main findings: interventions to the shape of decision-making

Wildcard

Sometimes also known as ‘Golden ticket’ or ‘Joker’. Each panel member (or other decision maker) is able to select one proposal (for example, per opportunity, per year, or similar) to guarantee funding (provided there is no conflict of interest), regardless of panel rankings or other decision-making processes.

Main intended aims

The main aims are to:

  • fund riskier, transformative ideas
  • save debating time in panels

Main hazards

The main hazard is that it is open to abuse if conflicts of interest are not monitored very well. This intervention requires anonymised reviewing.

Evidence strength

Three star

Findings

This is an intervention aimed mostly at increasing funding for new and riskier ideas. The underlying assumption, supported by the finding that a single poor review often means an application is rejected, is that panels tend towards conservatism. This intervention provides a way of circumventing this type of ‘group think’.

A secondary aim of this intervention is to reach funding decisions more rapidly. Especially controversial applications (applications with some very positive and some very negative reviews) tend to take up a considerable time in panel meetings. A ‘wildcard’ option means occasionally ending long discussions where agreements seemingly cannot be reached.
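The decision rule described above can be sketched in code. This is a minimal illustration only: the function name, data shapes and conflict handling are hypothetical rather than drawn from any funder’s actual system.

```python
def apply_wildcards(ranked, wildcard_picks, conflicts, n_awards):
    """Select awards from a panel ranking, but first guarantee each
    panellist's wildcard pick (provided there is no conflict of interest).

    ranked: panel's ordered list of proposal ids (best first)
    wildcard_picks: mapping of panellist -> chosen proposal id
    conflicts: set of (panellist, proposal id) pairs to exclude
    n_awards: total number of awards available
    """
    funded = []
    # Wildcard picks are funded regardless of their rank position
    for panellist, pick in wildcard_picks.items():
        if (panellist, pick) not in conflicts and pick not in funded:
            funded.append(pick)
    # Remaining slots are filled from the ranking as usual
    for proposal in ranked:
        if len(funded) >= n_awards:
            break
        if proposal not in funded:
            funded.append(proposal)
    return funded[:n_awards]
```

For example, with two awards available and one panellist using a wildcard on the fourth-ranked proposal, that proposal is funded alongside the top-ranked one.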

There are three known instances of implementation (at Volkswagen (VW) Foundation, FWF and Villum Foundation). At Volkswagen Foundation and Villum Foundation outcomes were generally as hoped. Awarded applicants included greater numbers of young and early career researchers, and selected proposals included ones which would not have been awarded based on ranked scores. At FWF, no reviewers chose to apply their ‘wildcard’ in any of the three funding opportunities where it was used.

At the VW Foundation, only 11 of 183 possible grants (6%) were awarded on the basis of a wildcard. One important effect of the wildcard option was to save time in meetings when two opposing opinions could not be resolved by further deliberation.

There is, however, a strong risk of cronyism, which means that conflicts of interest need to be monitored extremely carefully, and anonymised reviewing needs to accompany schemes where a ‘wildcard’ system is used. Both are likely necessary as even in anonymised reviewing, peer reviewers or panellists may still be able to infer the identity of the applicant based on the topic or approach. In addition, giving a panellist the power to outright select an application conflicts with interventions targeting subjectivity in the selection process via training.

The literature also tends to pair ‘wildcard’ systems with anonymised reviewing, so positive findings (for example, increased confidence to submit ‘braver’ ideas than usual) are likely partly attributable to this combined approach. The intervention has also only been used in experimental schemes thus far, setting a contextual predisposition towards riskier research.

This appears to be a somewhat controversial approach. Among the sources available to us, strengths and risks are variously emphasised, with some positive and some negative verdicts.

References

Mavericks and lotteries. Avin S.

Dealing with the limits of peer review with innovative approaches to allocating research funding (PDF, 314KB). Bendiscioli S and Garfinkel M.

I won a project! García-Ruiz JM.

Alternatives to peer review in research project funding. Guthrie S, Guerin B, Wu H, Ismail S and Wooding S.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

Fund ideas, not pedigree, to find fresh insight. Sinkjær T.

Interviewees and survey responses

No survey responses

Three interviews

Partial randomisation

Successful proposals are chosen at random. In most methodologies, randomisation is only partial. For example, proposals may be scored and sorted into bands, and only those on the border of being funded will be randomised.

Main intended aims

The main aims are to:

  • remove bias
  • reduce panel burden

Main hazards

The main hazard is reputational impact on applicants.

Evidence strength

Three star

Findings

Literature, survey and interviews all suggest that partial randomisation aims to remove bias (both demographic bias and bias against riskier ideas) and to reduce administrative burden in the selection process. Mostly the burden is mentioned in connection with ranking, but the literature suggests it has also been used (in combination with other interventions) to enable shorter applications. The use of partial randomisation is justified by increasingly overwhelming evidence that while peer or panel review reliably identifies the very highest-quality applications, as well as the ‘tail’ of unsuitable low-quality ones, it tends towards arbitrary decision-making in the ‘upper midfield’ of the quality spectrum.

Evidence on this intervention is mainly from observations from real-life applications, some of which have been assessed for diversity and applicant satisfaction. However, in most cases it is too early to say anything about effect on the actual nature of the funded research. The approach is further supported with statistical analysis suggesting arbitrariness in the traditional peer review process.

The data collection identified at least six research funding bodies where partial randomisation has been used. Some assessments have been carried out on the impacts of the intervention, and at least two funders (Volkswagen Foundation (VWF) and SNSF) were found to have diversified their awardee pool. In addition, application numbers at the BA, FWF and VWF were found to increase following the introduction of partial randomisation. In the case of VWF, this was reportedly due to a perceived higher chance of success among applicants.

We identify two main concerns. First (from the funders’ point of view), there is the risk of making lower-quality or less relevant awards. Second (from the applicants’ point of view), there is a concern about reputational impact from both rejections and successes.

The first concern is inevitable but can be mitigated by narrowing down the pool of applications to those where finding consensus among reviewers and panellists is challenging (genuinely poor-quality applications will at this stage already have been sifted out). This is typically the approach taken, and use of the term ‘partial randomisation’ rather than simply ‘lottery’ is generally preferred in order to emphasise this point.
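The banded approach described above can be sketched as follows. This is a minimal illustration; the band thresholds, budget and function name are hypothetical and not drawn from any specific scheme.

```python
import random

def partial_randomisation(proposals, fund_band, reject_band, budget, seed=None):
    """Fund top-band proposals outright, sift out the low-quality band,
    and randomise among borderline proposals for any remaining slots.

    proposals: list of (proposal_id, panel_score) pairs
    fund_band: score at or above which proposals are funded outright
    reject_band: score below which proposals are rejected outright
    budget: total number of awards available
    """
    rng = random.Random(seed)
    top = [p for p, s in proposals if s >= fund_band]
    borderline = [p for p, s in proposals if reject_band <= s < fund_band]

    funded = top[:budget]              # clear-cut awards first
    remaining = budget - len(funded)
    if remaining > 0 and borderline:
        rng.shuffle(borderline)        # lottery only at the margin
        funded += borderline[:remaining]
    return funded
```

The key design point is that randomisation applies only in the ‘upper midfield’, where review scores are known to be least reliable; the strongest and weakest applications are decided conventionally.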

The second concern has been approached differently by different funders. At VWF, applicants were concerned that their awards (if successful) would be credited to randomisation; this was mitigated by the added use of wildcards and by not disclosing which applications were awarded via which method. FWF also used wildcards in combination with partial randomisation; however, in the three opportunities of its 1,000 Ideas programme, no reviewers used the wildcard option. FWF believes this is because panel members were concerned about their reputation should other jury members disagree about the value of the application supported by the wildcard. Conversely, at SNSF, to ensure transparency, all applicants were informed in both rejection and award letters whether partial randomisation was used.

While there is controversy around this intervention in general terms, all evidence on implementation is fairly positive. From the UK, we also have anecdotal evidence that the academic community’s response to NERC’s partial randomisation trial has been overwhelmingly positive.

We note also that there is considerable versatility in application, for instance in how conservatively the process is used. RCN has thus far only randomised the selection of applications that are identical either in idea or in scoring. Partial randomisation can be (and often is) paired with applicant anonymisation to more fully avoid bias (as some degree of filtering of applications takes place in all instances of implementation). VWF relies on a four-step process: anonymisation, an outline stage leading to full applications, wildcards, and then partial randomisation of those applications of good quality but not selected outright.

References

Mavericks and lotteries. Avin S.

The troubles with peer review for allocating research funding. Bendiscioli S.

The experimental research funder’s handbook. Bendiscioli S.

NIH peer review percentile scores are poorly predictive of grant productivity. Fang FC, Bowen A and Casadevall A.

Research funding: the case for a modified lottery. Fang FC and Casadevall A.

Alternatives to peer review in research project funding. Guthrie S, Guerin B, Wu H, Ismail S and Wooding S.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

The consequences of competition: simulating the effects of research grant allocation strategies. Höylä T, Bartneck C and Tiihonen T.

Are peer-reviews of grant proposals reliable? An analysis of Economic and Social Research Council (ESRC) funding applications. Jerrim J and de Vries R.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

Blind Luck – Could lotteries be a more efficient mechanism for allocating research funds than peer review? Roumbanis L.

Peer review or lottery? A critical analysis of two different forms of decision-making mechanisms for allocation of research grants. Roumbanis L.

Partially Randomized Procedure – Lottery and Peer Review. VolkswagenStiftung.

Give Chance a Chance. Walsweer T.

Interviewees and survey responses

One survey response

Six interviews

Scoring mechanisms

Includes calibration of scores, consensus vs voting, weighting.

Main intended aims

The main aims are to:

  • increase the relevance of funded projects to the aims of the funding scheme
  • improve review quality or reliability

Main hazards

There are no confirmed hazards but it may disadvantage high-risk or high-reward applications.

Evidence strength

Four star

Findings

Consulted funders and literature point to two main variants of this intervention. The first involves applying equal weighting to certain criteria to meet the specific needs of the funding scheme (for example, wider or non-academic impact, novelty). Reviewed literature and survey respondents provided examples of equal weighting of scientific merit and impact, and of making the scoring matrix and calibration of scores more quantitative or absolute. Consulted funders claim the intervention succeeded in ensuring the right applications were funded given the importance of impact. The intervention also appears to have been successful in that panels followed the criteria and funder instructions rather than any other considerations (for example, applying or balancing criteria as they would in ‘ordinary’ schemes).
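As an illustration of the first variant, a weighted scoring matrix might combine per-criterion scores as follows. This is a sketch only: the criterion names and weights are hypothetical, with equal weighting of merit and impact shown as one possible configuration.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores into a single application score.

    scores: mapping of criterion name -> panel score
    weights: mapping of criterion name -> weight (must sum to 1
    and cover the same criteria as `scores`)
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same criteria")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(scores[c] * weights[c] for c in scores)

# Equal weighting of scientific merit and impact (hypothetical values):
score = weighted_score({"merit": 8.0, "impact": 6.0},
                       {"merit": 0.5, "impact": 0.5})  # 7.0
```

Making the weights explicit in this way is one means of ensuring panels apply the funder’s intended balance of criteria rather than their own.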

In the second variant of this intervention, reviewers may apply their own interpretation of criteria and weightings when scoring proposals. According to the literature, calibration of scores (disclosure of scores and discussion to calibrate them between reviewers) has been found to converge scores within a panel, but not to increase reliability overall (as tested in experiments with multiple panels scoring the same proposals).

Our research has not identified any known hazards of this intervention.

The literature points out that, depending on the situation, the two objectives for the use of this intervention (increase reliability and meet specific funding needs) might lead to opposite recommendations. For instance, to increase reliability, one might recommend calibration and elimination of outliers, whereas to identify and fund novel research, one might want to prioritise proposals with highly variable scores.

References

Peer review of grant applications: criteria used and qualitative study of reviewer practices. Abdoul H, Perrey C, Amiel P, Tubach F, Gottot S, Durand-Zaleski I, Alberti C.

Alternatives to peer review in research project funding – 2013 update. Guthrie S, Guerin B, Wu H, Ismail S, Wooding S.

Are peer-reviews of grant proposals reliable? An analysis of Economic and Social Research Council (ESRC) funding applications. Jerrim J and de Vries R.

Commensuration bias in peer review. Lee CJ.

‘Your comments are meaner than your score’: score calibration talk influences intra- and inter-panel variability during scientific grant peer review. Pier EL, Raclaw J, Kaatz A, Brauer M, Carnes M, Nathan MJ and Ford CE.

Interviewees and survey responses

Two survey respondents

No interviews

Sequential application of criteria (rather than simultaneous application)

Proposals are scored against one set of criteria, ranked, and a cut-off point is determined. Those above the cut-off point are then assessed against another set of criteria to determine the final funded list.

Main intended aims

The main aims are to:

  • ensure application of all criteria
  • increase relevance to programme aims

Main hazards

None known

Evidence strength

Two star

Findings

This intervention is typically related to two-stage approaches, pre-proposals or both. The literature shows that funders can use this for programmes with a complex set of aims (typically research excellence as well as non-academic relevance), where assessing based on one set of criteria first can help reduce the burden of subsequent rounds.

The Dutch NWO’s ‘Veni’ programme required the submission of a CV and track record along with an initial short proposal. This first round thus included an assessment of the set of criteria related to the researchers’ qualities. This allowed a reduction in the number of applications progressing to the subsequent round focused on an assessment of the proposal and potential impact. This is reported to have worked well and reduced the assessment time of reviewers by 25%.
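The two-pass logic described above can be sketched as follows. This is a minimal illustration; the scoring functions, cut-off and parameter names are hypothetical rather than taken from the NWO process.

```python
def sequential_assessment(applications, stage1_score, stage2_score, cutoff, n_awards):
    """Two-pass selection: rank all applications on a first set of
    criteria, keep only those at or above the cut-off, then rank the
    survivors on a second set of criteria to pick the funded list.

    stage1_score: e.g. a function scoring CV and track record
    stage2_score: e.g. a function scoring proposal quality and impact
    """
    # Stage 1: assess everyone on the first criteria set
    shortlist = [a for a in applications if stage1_score(a) >= cutoff]
    # Stage 2: only the shortlist is assessed on the second criteria set
    final = sorted(shortlist, key=stage2_score, reverse=True)
    return final[:n_awards]
```

The burden reduction comes from the second, more detailed assessment being applied only to the shortlist rather than to the full application pool.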

We note that the evidence on this form of criteria application is too limited to arrive at a strong verdict. However, it needs to be considered alongside the aforementioned two-stage approaches and pre-proposals, which often implicitly take this approach (though not always, hence we treat this intervention separately here).

References

Science Europe study on research assessment practices. del Carmen Calatrava Moreno M, Warta K, Arnold E, Tiefenthaler B, Kolarz P and Skok S.

UKRI Research and Innovation Funding Service (RIFS) visioning work – Final report (PDF, 1.5MB). Kolarz P, Bryan B, Farla K, Krčál A, Potau X and Simmonds P.

How research funders ensure the scientific legitimacy of their decisions: investigation in support of the design of Formas scientific management. Kolarz P, Arnold E, Davé A and Andréasson H.

Interviewees and survey responses

No survey responses or interviews

Use of quotas

After ranking, proposals are reviewed to ensure sufficient numbers in certain categories including quotas related to protected characteristics, place, first-time applicants, etc.
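Although we found no evidence of implementation at the decision point, the underlying logic can be sketched as follows (a purely illustrative example; the category, ranking and numbers are hypothetical):

```python
# Hypothetical sketch of a post-ranking quota adjustment: if the funded set
# contains fewer than the minimum number of proposals in a given category,
# the best-ranked unfunded proposals in that category displace the
# lowest-ranked funded proposals outside it.

def apply_quota(ranked, n_fund, in_category, minimum):
    funded = ranked[:n_fund]
    shortfall = minimum - sum(1 for p in funded if in_category(p))
    if shortfall <= 0:
        return funded  # quota already met by rank order alone
    # Best-ranked unfunded proposals in the category fill the shortfall
    promoted = [p for p in ranked[n_fund:] if in_category(p)][:shortfall]
    keep, to_drop = [], len(promoted)
    for p in reversed(funded):  # walk up from the bottom of the funded list
        if to_drop and not in_category(p):
            to_drop -= 1  # displaced to make room for a quota candidate
        else:
            keep.append(p)
    keep.reverse()
    return keep + promoted

# Toy example: a minimum of one first-time applicant among three awards
ranked = [
    {"id": "A", "first_time": False},
    {"id": "B", "first_time": False},
    {"id": "C", "first_time": False},
    {"id": "D", "first_time": True},
    {"id": "E", "first_time": True},
]
funded = apply_quota(ranked, n_fund=3,
                     in_category=lambda p: p["first_time"], minimum=1)
# "C" is displaced by the best-ranked first-time applicant, "D"
print([p["id"] for p in funded])
```

The sketch makes the ‘drastic’ nature of the hazard visible: a proposal ranked above the cut-off is displaced purely because of its category, which is why some literature prefers other routes to diversity.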

Main intended aims

The main aim is to avoid or counteract bias and ‘clustering’.

Main hazards

The main hazard is that this is a very drastic approach.

Evidence strength

One star

Findings

Quotas are a means of avoiding clustering of investments in places or themes and ensuring equitable success rates among disadvantaged and minority researcher populations.

Our research indicates that funders use quotas to ensure diversity among reviewers and panel members (see also intervention on ‘embedding EDI in assessment’ below). Some literature items discuss applying quotas as an option at the decision point, but we found no evidence of implementation. Some literature items point to this measure being too drastic and that funders can achieve diversity through other means (for example, working with underrepresented groups, partial randomisation, anonymisation).

References

Addressing racial disparities in NIH funding. Comfort N.

Grant allocation disparities from a gender perspective: literature review. Cruz-Castro L and Sanz-Menéndez L.

The modified lottery: formalizing the intrinsic randomness of research funding, accountability in research. De Peuter S and Conix S.

Interviewees and survey responses

One survey response

No interviews

Main findings: interventions in training and feedback

Bringing in reviewers from earlier careers and providing mentoring

Panels and reviewers tend to be very experienced researchers or innovators. Those early in their careers could be invited to review or be part of panels with additional training, bringing different perspectives and experiences. Previous funding opportunity award winners may also be brought in as reviewers or panellists.

Main intended aims

The main aims are to:

  • improve review quality
  • diversify reviewers

Main hazards

None known

Evidence strength

Two star

Findings

Funders use this intervention to improve review quality, reduce burden, diversify the pool of reviewers and provide career support to early career researchers (ECRs). Most consulted funders involve ECRs in the peer review and one (CRUK) invites ECRs to observe panels and committees. One funder (Wellcome) has specific targets for the number of ECRs on its panels.

All consulted funders report positive feedback from involved ECRs. The experience helped them to learn about the process, improve their grant writing skills and made the assessment process more transparent.

All funders report significant interest and demand from ECRs to be involved in the peer review. This results in improved ability of funders to secure reviewers (because of the larger pool available). Similarly, all sources available to us report that ECRs provide very good quality reviews and are very enthusiastic. Though most evidence is based on funder observation rather than controlled experiments, there is general agreement on the effectiveness of this intervention.

References

None

Interviewees and survey responses

Three survey responses

Three interviews

Embedding EDI in assessment

Training or support provided to make assessors aware of their unconscious biases and to encourage them to call each other out during the assessment process.

Main intended aims

The main aims are to:

  • reduce bias
  • increase diversity among awardees

Main hazards

The main hazard is that ineffective training may instil a false sense of confidence.

Evidence strength

Two star

Findings

Funders introduce this intervention to reduce bias, enable fair decisions and improve diversity in the funded portfolio.

Consultees pointed out that there is no way to remove bias entirely, but they feel that highlighting potential issues (via training) helps. It is difficult to demonstrate the effectiveness of this intervention because it is hard to attribute positive changes to a single intervention (hence the relatively low star-rating for this intervention despite many sources). But anecdotal feedback from panel members is that it does make them question their biases and decisions. For example, since anti-bias training for the Austrian FWF’s board (‘Kuratorium’), every board meeting now starts with a one-slide reminder about bias and the need to call it out.

Submissions and success rates by demographics are periodically reviewed in organisations including the UK research councils. Different funding institutions in Canada report improvements in the diversity of the funded portfolio (though, as noted, causality is difficult to confirm).

The evidence we find is largely anecdotal, drawn from funders and in some cases supported by monitoring data, and attributing change to a single intervention remains a challenge. There is nevertheless broad consensus about the relevance of this intervention.

Available material discussing the effectiveness of this type of training offers mixed views and cautions against blind reliance on it. Training approaches vary greatly, and ineffective training may harmfully instil a false sense of confidence. The reliability of self-reported results has also been questioned. Other critiques question its impact at the institutional level and note that bias training should form only part of a more holistic approach addressing the bigger picture. That said, there is little evidence addressing unconscious bias training in grant peer review specifically (the majority of the material on the effect of unconscious bias training focuses on general professional or healthcare settings).

We note that there are several aspects to EDI, and a wide range of techniques that may be implemented. Additional forms of embedding EDI at UKRI have included reducing the number of proposals per panel and increasing the number of breaks to help with cognitive overload, silent scoring, and management of any unacceptable behaviours or comments at the panel. Some councils have also had diversity panel targets in place since May 2016 (for example, panels aim for at least 30% representation of the underrepresented gender), and there are gender and ethnicity targets to increase diversity of perspective in assessment (for current guidance, see: Evolving and upholding fairness in peer review). These targets have since been met or surpassed.

While not fully related to this intervention, we note that all interviewed funders try to ensure balanced panels, and two funders (Wellcome and CRUK) have introduced diversity targets and quotas. Wellcome achieved its targets in 2022, which has been a big success, as it brings in a much broader range of voices. Wellcome did not face challenges securing diverse membership, though there remains a tendency to return to the same reviewers. It is too early to tell what impact this intervention has had on the diversity of the portfolio: Wellcome has not yet seen a significant shift in, for example, the proportion of ethnic minority groups. The measure was introduced because it was deemed the right thing to do and to improve the diversity of voices. This intervention has become the new normal at Wellcome and will not change.

References

Inequality in early career research in the UK life sciences. Dias Lopes A and Wakeling P.

Understanding our portfolio: a gender perspective. EPSRC.

Interventions designed to reduce implicit prejudices and implicit stereotypes in real world contexts: a systematic review. Fitzgerald C, Martin A, Berner D and Hurst S.

Scientists from minority-serving institutions and their participation in grant peer review. Gallo SA, Sullivan JH and Croslan DJR.

The problem with implicit bias training. Green T and Hagiwara N.

The state of women in life sciences at McGill: a summary and report of the Win4Science Forum (PDF, 918KB). Hillier E, Razack S, Clary V and Münter L.

Who resembles a scientific leader – Jack or Jill? How implicit bias could influence research grant funding. Kolehmainen C and Carnes M.

The good, the bad, and the ugly of implicit bias. Pritlove C, Juando-Prats C, Ala-Leppilampi K and Parsons J.

Does gender bias still affect women in science? Roper RL.

Assessment of potential bias in research grant peer review in Canada. Tamblyn R, Girard N, Qian CJ and Hanley J.

Unconscious bias and diversity training – what the evidence says. The Behavioural Insights Team.

Is there gender bias in research grant success in social sciences? Hong Kong as a case study. Yip PSF, Xiao Y, Wong CLH and Au TKF.

Interviewees and survey responses

Three survey responses

Four interviews

Expanding or reducing feedback to unsuccessful applicants

Different levels of feedback may be provided on unsuccessful applications.

Main intended aims

The main aims are to:

  • improve transparency
  • improve applicants’ learning from unsuccessful applications

Main hazards

The main hazards are:

  • an added burden
  • that feedback may be of inconsistent quality

Evidence strength

Two star

Findings

Consulted funders use feedback mainly to explain decisions and thus increase the transparency of the assessment process. Some share feedback only if an applicant requests it. A secondary aim is to ensure a better learning process for unsuccessful applicants.

Literature on the subject is limited, but one study shows that well-developed, good-quality feedback helps applicants to improve the quality of future applications. Consulted institutions reported that in the rare instances when unsuccessful applicants receive feedback, it is very helpful, and they encouraged funders to do so more often.

One funder changed the presentation of feedback by sending panel members’ written comments verbatim instead of a summary of the panel discussion. This was not effective: applicants received several sets of comments, which could conflict with each other, making it difficult to understand the rationale for the decision on their application.

A notable hazard is that it is hard to be consistent and equitable with the type of feedback given, as the quality of feedback may differ at least slightly. Additionally, consolidating, checking and distributing feedback creates an additional burden for the funder. There is therefore a trade-off here between transparency and learning on one hand and reduced burden on the other.

Evidence is mostly anecdotal, based on funder observations and one survey of applicants. However, there generally seems to be appetite for more feedback on unsuccessful applications. Given the added burden, there is a case to consider carefully whether feedback is more useful in some funding schemes, or for some applicant types, than for others.

References

Targeted, actionable and fair: reviewer reports as feedback and its effect on ECR career choices. Derrick GE, Zimmermann A, Greaves H, Best J and Klavans R.

Study on the proposal evaluation system for the EU R&I framework programme. Rodriguez-Rincon D, Feijao C, Stevenson C, Evans H, Sinclair A, Thomson S and Guthrie S.

Interviewees and survey responses

One survey response

Five interviews

Funder representation on review panels

The funder is represented on the panel to guide discussion or provide briefing on programme aims. Their role goes beyond a purely administrative function; they may even act as chair or similar.

Main intended aims

The main aims are to:

  • ensure guidance is followed
  • help ensure the relevance of decisions to scheme aims

Main hazards

None known

Evidence strength

Two star

Findings

Funders are usually represented on review panels to ensure the panels follow the guidance and to document the process, but not in an advisory or chair role. In this intervention, representatives of the funder take a more active role in communicating scheme aims and ensuring that review and discussion stay focused on the scheme’s main criteria. This may happen at the start of panel meetings, but may also involve prior briefings, as well as reminders and steering while applications are being discussed. However, even within this intervention, funder representatives generally do not have a role in making the funding recommendations as such (they may steer, but they do not have a ‘vote’).

We note that ‘funder representation’ is a term that is somewhat open to interpretation. For example, in the Austrian FWF’s Emerging Fields scheme, the FWF board is involved at several points in the multi-stage decision-making process. Its members are based at various research performing organisations. However, the Executive Board president chairs these board meetings. This individual has a strong academic track record and experience but is also closely familiar with the funder’s strategy and operations. The Emerging Fields scheme is currently subject to evaluation, so the effects of this form of funder representation are at this point unknown.

A more ‘clear-cut’ type of funder representation on panels occurs at the Human Frontier Science Program (HFSP), which has recently undergone a full organisational and process review. The review found that, in line with its stated objectives, the HFSP process successfully identifies the most innovative, ‘frontier’ research ideas and recommends them for funding. However, the review further found that it relies primarily on culture rather than process structure to achieve this, and that this ‘HFSP culture’ is in part perpetuated through the presence and input of secretariat staff at the panel meetings. While secretariat staff are not involved in the decision making itself, they brief both the panel in general and new panellists individually about the purpose of the programme and the emphasis on ‘frontier’ research that panellists are expected to identify and reward.

A further example worth noting (though it does not constitute ‘funder representation’ in the strict sense) is ESRC’s Transformative Research scheme. In its third round, an awardee from the first round two years prior was selected as panel chair, with the aim of ensuring a cultural understanding of the scheme aims (and therefore of the criteria and how to select applications) would be ingrained in the panel as much as possible. While this chair did not represent the funder (ESRC) as such, their previous involvement with the scheme meant that they could be an important voice to communicate the scheme aims to the rest of the panel. Evaluation of the scheme found that, alongside other process innovations, this selection of panel chair played an important role in maintaining the panel’s focus on the ‘transformative’ element of submitted applications.

While evidence on this intervention is relatively limited, there is agreement among consultees that funder presence on panels helps to ensure panels follow the funder guidance and thus improves the quality of the assessment, and some evaluative evidence points in the same direction.

References

Evaluation of the ESRC Transformative Research Scheme (PDF, 1.5MB). Kolarz P, Arnold E, Farla K, Gardham S, Nielsen K, Rosemberg C and Wain M.

Organisational and process review of the Human Frontier Science Program. Kolarz P, Davé A, Bryan B, Urquhart I, Rigby J and Suninen L.

Interviewees and survey responses

No survey responses

Two interviews

Improving quality of reviews

Review quality may be improved through training, retention of good reviewers, or recognition. Peer review colleges fit here too.

Main intended aims

The main aims are to:

  • improve quality of reviews
  • simplify training
  • increase response rate for review requests

Main hazards

None known

Evidence strength

Four star

Findings

Funders use training and peer review colleges to improve the quality of reviews. The literature on this intervention mostly uses reviewer agreement as a proxy for improved quality of the review. Peer review colleges are also seen as a tool to address peer-review fatigue and to increase reviewer response rates.

However, some sources speculate about high disagreement being an indication of the high-risk/high-reward nature of an application. We note this to indicate that, while the overall evidence base on this intervention is strong, there is some disagreement about whether this common form of measuring its success might have some limitations.

One controlled trial at the US National Institutes of Health showed that a training programme to increase inter-rater reliability improved scoring accuracy and reviewer agreement. Consulted funders also report that a peer review college provides a large pool of reviewers to approach initially who are familiar with the scheme and have a proven track record of providing good reviews. Additional training, developed based on common review errors, has also been useful.

An ongoing example of the above can be found at EPSRC, whose peer review college consists of more than 6,000 members, all of whom have undergone online training upon joining. The training, among other aspects of membership, is reported to help members increase their knowledge of proposal writing and reviewing.

Funders note that they receive more reviews per request from college members compared to ‘cold’ peer review invites as well as a higher percentage of reviews of suitable quality.

Evidence on this intervention is fundamentally strong: there are controlled trials reported in the literature (on training), as well as funder observations and monitoring data on positive responses to peer review requests and improved review quality.

References

ESF survey analysis report on peer review practices (PDF, 6.4MB). ESF.

What do we know about grant peer review in the health sciences? Guthrie S, Ghiga I and Wooding S.

Individual versus general structured feedback to improve agreement in grant peer review: a randomized controlled trial. Hesselberg J-O, Fostervold K I, Ulleberg P and Svege I.

Global state of peer review 2018. Publons and Clarivate Analytics.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo A, Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

Grant peer review: improving inter-rater reliability with training. Sattler D N, McKnight P E, Naney L and Mathis R.

Peer review of health research funding proposals: a systematic map and systematic review of innovations for effectiveness and efficiency. Shepherd J, Frampton G K, Pickett K, Wyatt J C.

Interviewees and survey responses

Two survey responses

Three interviews

Open review or rebuttal

Reviews are published or made available to the applicant, or both, before funding decisions are taken, so they can be viewed and responded to.

Main intended aims

The main aim is to increase accountability and review quality.

Main hazards

The main hazard is a possible increase in burden for the funder (and longer timelines, depending on how the rebuttal process works).

Evidence strength

Three star

Findings

Open reviews and rebuttals are well-known elements of journal peer review but are also becoming recognised as potential tools for grant peer review. Open peer review is an umbrella term covering open identities, open reviews and open interaction. These practices are expected to increase accountability, allow unjust reviews to be challenged, give applicants more voice in the process and increase overall review quality. The latter is particularly enabled by applicants being able to clarify cases where reviewers have genuinely misunderstood some of the application’s content, which may be especially important where English is not the applicant’s first language. Open identities are also hoped to contribute to the credit of the reviewer.

Consulted funders were positive about the intervention as it is well received by the applicants and reviewers and helps to increase the transparency of the process.

There are, however, some opposing voices. Concerns have been raised about the potential for reduced rigour and valid criticism where the identities of the reviewers are made known. Literature also points to a potential (but not evidenced) increase in burden, though consulted funders did not raise this concern.

A prominent example of the ongoing use of rebuttals is at NWO (Dutch Research Council), where reviews are shared with applicants, who then have one week to produce a short rebuttal reflecting on issues raised by reviewers. These rebuttals are then considered by the panel alongside the reviews; in other words, the rebuttals may influence the funding recommendation. One reviewed study suggests that this rebuttal stage may have a corrective effect on some degree of gender bias in the review process.

A similar process is also used at UKRI. There is a 10 working day turnaround time for lead applicants to provide a rebuttal (recently extended from five working days to take into account EDI concerns). These may then influence the final funding decision. They are provided to the panel alongside the written reviews to aid their decision making.

References

Attitudes and practices of open data, preprinting, and peer review: a cross-sectional study on Croatian scientists. Baždarić K, Vrkić I, Arh E, Mavrinac M, Marković M G, Bilić-Zulle L, Stojanovski J and Malički M.

Gender-equal funding rates conceal unequal evaluations. Bol T, de Vaan M and van de Rijt A.

Horizon Europe and Plan B research funding: turning adversity into opportunity. Cavallaro M.

Strategic advice for enhancing the gender dimension of Open Science and Innovation Policy. Gender Action.

Innovating in the research funding process: peer review alternatives and adaptations. Guthrie S.

Response by the Author. Hill M.

Decision-making approaches used by UK and international health funding organisations for allocating research funds: a survey of current practice. Meadmore K, Fackrell K, Recio-Saucedo A, Bull A, Fraser S D S and Blatch-Jones A.

How research funders ensure the scientific legitimacy of their decisions: investigation in support of the design of Formas scientific management. Kolarz P, Arnold E, Davé A and Andréasson H.

What works for peer review and decision-making in research funding: a realist synthesis. Recio-Saucedo, A. Crane K, Meadmore K, Fackrell K, Church H, Fraser S and Blatch-Jones A.

Interviewees and survey responses

One survey response

Two interviews

Additional interventions identified by our review

Our data collection focused on the 38 interventions that served as a baseline for this review. However, while going through the literature on the set of 38 and running several consultations, we identified some other interventions not included in the initial list. Two of these are recently introduced interventions facilitated by technological advances and increased use of information technology to support the peer review process. A third summarises various actions to improve behaviours and culture or supporting EDI in interviews and panel meetings. We briefly describe these interventions here.

Roving panel members

The UKRI Future Leaders Fellowships scheme uses panels of experts from across the research and innovation system to consider all assessment criteria, with ‘roving’ panel members who move between panels during the assessment process to ensure consistency and quality. The intervention is easy to introduce and, according to programme staff, makes a significant difference in ensuring consistency across the panels.

Discussion boards

BBSRC uses discussion boards (shared virtual platforms for information exchange) to reduce ‘on the day’ peer review pressure, because the discussion has already taken place online over a three-week period before the panel meeting. In the actual panel meeting, reviewers only have to discuss outstanding issues and agree a ranked list. Discussion boards allow panel members to be flexible with the time they commit to the review. They increase the transparency of pre-panel work and improve benchmarking of scoring before the meeting. Our consultees pointed out that discussion boards help to remove the ‘corner of the room’ discussions that happen at in-person meetings, which may not be transparent and cannot be challenged. Furthermore, discussion boards enable clear and detailed feedback to applicants because all discussions are recorded.

Use of videos

Several consulted funders reported using videos as part of the application process. For example, UKRI uses video clips along with short application forms at the expression of interest (EOI) stage of the Healthy Ageing Catalysts programme. FWF uses videos for pre-selection before soliciting full applications in its Momentum programme, which funds researchers one to two years after tenure to provide a boost for their career. The programme received many applications, so FWF introduced pre-applications with a three-minute video for the first assessment stage. The accompanying evaluation found no bias in the assessment based on the video format (for example, showing diagrams, not showing the speaker), and reviewers were happy with the format. The video format helped to keep the burden for reviewers low.

Improving culture or supporting EDI in interviews and panel meetings

Our consultation also reveals several small interventions or tweaks to the assessment process, all aimed at improving assessment culture and supporting EDI. None of these modifications can improve the process alone, but in combination with other measures, these small interventions can have a positive impact. For example, UKRI’s Future Leaders Fellowships programme introduced silent reflection periods in interviews (adapted from prior use by EPSRC in 2016 to 2017). After each interview, a two-minute silent period is mandated, during which no one may speak. This helps to prevent initial verbal reactions about how good or bad the interview was, which may otherwise colour the rest of the discussion. The silent period is intended for panel members to reflect on what they have heard and develop reasoning for their grading.

In the same programme, UKRI introduced a numbers-first approach: panel members first give their grades without commentary, avoiding the risk that they change their views and grades because other panel members have different, or more loudly expressed, opinions. UKRI observed improved panel discussion quality after introducing these measures.

Summary and recommendations

Our headline findings are noted at the outset of this report. We have summarised our list of recommendations resulting from our research in a summary table of aims, hazards and evidence strength. It shows how each of the 38 interventions relates to the seven main aims posited at the start, as well as the main hazards of each intervention, and our evidence strength rating.

Recommendations

There are several recommendations stemming from our research. An initial draft of these was discussed and slightly refined at a validation meeting with UKRI in April 2023.

We note that the recommendations below are not specific to UKRI. They are intended as recommendations of good practice for any organisation involved in R&I funding.

Recommendations on how to use the interventions

Our headline recommendation is that process design should always be a constituent part of scheme design. The standard review process posited at the start of this report (submission, eligibility check, two to three external reviews, panel review, decision) should never be a ‘default’. Every funding scheme has specific aims and characteristics, and so the design of the application, review and decision-making process should be considered for each individual funding opportunity.

We encourage funders to make extensive use of the interventions studied here and to vary their assessment processes widely. Our review shows that some interventions that are highly effective in achieving desired outcomes (for example, two-stage processes, encouraging positive behaviours, interactive assessment processes) still require additional staff effort, which can be challenging under resource constraints. However, plenty of interventions also present opportunities for resource savings (for example, automation-assisted reviewer allocation, virtual panels and partial randomisation). Funders can therefore strategically review the mix of their funding portfolio, use interventions appropriate to the objectives of specific funding schemes, and seek a balance in the resources the chosen interventions require. For example, resources saved by introducing partial randomisation or panel-only approaches for smaller grants can be used to run two-stage processes and recruit non-academic reviewers in programmes that fund projects with extra-scientific objectives.

It is worth noting that such diversification may create a high cognitive load for both funder staff and researchers. To facilitate it, funders therefore need the resources and modernised systems required to implement interventions as easily as possible. This likely constitutes an important confluence point between this study and other recent work in the UK and beyond on research bureaucracy and research culture.

There are many reasons to reduce bureaucracy and change research culture, and doing so will likely also create conditions where interventions to peer review processes can be implemented more easily.

Most critically, to ensure our recommended level of variation is possible, IT systems need the necessary flexibility and functionality. Funders’ application and review management systems (the IT underpinning the process) need to be designed so that interventions can easily be integrated into every bespoke scheme setup. While this is not a prerequisite for all 38 interventions studied here, it plays a part in many of them. Outdated, overly rigid IT systems risk stifling funders’ ability to vary and optimise their processes.

Finally, we note that the judgement of experienced R&I funder staff is essential. Almost every intervention we have considered has advantages as well as potential hazards and drawbacks. Our research can give extensive guidance on which interventions might suit a particular funding scheme, but scheme design is not a mechanical process with ‘only one right answer’.

Most interventions studied here are suitable for specific contexts and should not be rolled out across all R&I funding opportunities. Indeed, a small number have extremely limited applicability (use of quotas, metrics, dragon’s den pitches). However, some interventions have the potential to become a ‘new normal’, reducing burden and bias across the board.

Funders may improve diversity by providing additional support to groups underrepresented in their portfolio, encouraging and supporting them to apply. Of the interventions aiming to support greater inclusion, working with underrepresented groups is the one with the highest demonstrated evidence strength. Implementation may vary from more sophisticated actions, including hands-on support, to less involved actions, such as simply stating in the funding opportunity document that underrepresented groups are encouraged to apply. Both approaches are shown to be effective.

Use of peer review colleges (and the training and briefing opportunities they entail) may be a good default practice to improve review quality. Where the expertise represented on such colleges does not cover certain applications, however, it must remain possible to recruit reviewers from beyond the college. Funders should ensure the college membership is diverse (for example, open to ECRs) and open to new participants.

On a related note, automated reviewer allocation offers a genuine opportunity to reduce administrative burden, avoid conflicts of interest and increase reviewer response rates. Experience-sharing among funders will be important here, especially in relation to which systems have been proven to work. Peer review colleges combined with automation-assisted reviewer allocation would bring additional benefits.

There is a good case to substantially expand the use of anonymised reviewing. Most funding schemes likely need, at some stage, to scrutinise the track record of applicants, but in multi-stage assessment processes and for smaller awards (where risk levels are lower), anonymising at least parts of the process would help reduce bias and inequitable outcomes.

While often seen as a ‘radical’ innovation in R&I funding, there is a good case to mainstream an element of partial randomisation across most R&I funding endeavours. This should not be extensive and should not cover all or even the majority of funding decisions: expert judgement through peer and panel review does well at identifying the very best applications, as well as the ‘tail’ of unsuitable ones. However, having partial randomisation as a consistently available option would enable some time savings and counteract bias, both against underrepresented groups and high-risk or high-reward ideas.

As a minimum, randomisation should be used in cases where applications are of indistinguishable quality, so as to avoid excessive and laboured discussion. Funders may, however, go further and randomise among a larger subset of high-quality applications where panels struggle to reach agreement.
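As an illustration only, the tie-break variant described above can be sketched as a simple procedure. The scoring scale, field names and function are hypothetical, not drawn from any funder’s actual system:

```python
import random

def select_awards(applications, budget, seed=None):
    """Partial randomisation sketch: fund clear winners on merit, then
    draw lots only among applications tied at the funding cutoff.

    `applications` is a list of (app_id, score) pairs; `budget` is the
    number of awards available. All names and scores are illustrative.
    """
    rng = random.Random(seed)
    ranked = sorted(applications, key=lambda a: a[1], reverse=True)

    # Score of the first application just outside the budget; any
    # application scoring strictly above it is funded outright.
    cutoff_score = ranked[budget][1] if len(ranked) > budget else None
    funded = [a for a in ranked[:budget]
              if cutoff_score is None or a[1] > cutoff_score]

    # Applications tied at the cutoff score enter a lottery for the
    # remaining slots, avoiding laboured discussion over indistinguishable cases.
    tied = [a for a in ranked if cutoff_score is not None and a[1] == cutoff_score]
    remaining = budget - len(funded)
    funded += rng.sample(tied, min(remaining, len(tied)))
    return [app_id for app_id, _ in funded]
```

The design choice here mirrors the recommendation: merit ranking decides the clear cases at the top and bottom, and the random draw applies only at the margin where expert judgement cannot reliably distinguish quality.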

Recommendations on testing and further research

For some of the interventions covered in this report, there is limited evidence of their effectiveness simply because they have not been empirically studied to a sufficient degree. Virtual panels are potentially the most telling example.

Many research funders have adopted virtual panels since the COVID-19 pandemic and report that they have become a ‘new normal’ because of the time savings and the associated improved ability to secure panel membership and diversity. While these gains are clear and valuable, evidence of the impact on discussion quality is scarce and requires further research. This should not necessarily discourage R&I funders from considering the intervention.

For both well-tested and more embryonic interventions, we recommend that funders monitor any interventions they undertake, and where possible compare them to a pre-intervention baseline or to other funding schemes running in parallel. Importantly, funders should share good practice with their peers so that successes can be mainstreamed.

To mitigate the perceived risk that may accompany innovative use of the interventions, we recommend that funders first test an intervention at a smaller scale via a pilot opportunity, or commission an accompanying process evaluation. Where an intervention is introduced into an existing programme, funders can organise an evaluation, or simply a review of monitoring data, comparing processes and outcomes before and after the change.

Such comparison makes it possible to detect the benefits (or their absence), improve the process and make a case to decision makers. Most evaluations of these interventions rely on analysis of programme monitoring data and consultation with programme staff and stakeholders (applicants and reviewers), and are completed during or shortly after the funding opportunities that introduce the new interventions.

Our review shows that some interventions (demand management, shortening applications) can reduce the burden for the funder but not for the system as a whole, because the burden is simply shifted elsewhere: to the research community, to institutions, or to other funders. R&I funders should therefore follow up and assess the effects of the interventions on these wider constituencies.

Recommendations beyond the interventions

Our review reveals that the assessment process can be improved with various interventions. However, procedural changes alone cannot fix wider systemic problems that may exist in research culture. Interventions can often go some way towards enabling improved outcomes, but wider problems of research culture may persist and even dampen their capacity to achieve the greatest possible effect. We note this in particular because many funders and experts have made great efforts in recent years to assess and improve elements of research culture, and our findings should not be read as alternative ‘quick fixes’ to those important endeavours. Work on wider research culture categorically needs to continue alongside the process interventions discussed in this report.
