Science Article with an Analysis of NIH Peer Review

(by datahound) Apr 23 2015

In the current issue of Science, Li and Agha present an analysis of the ability of the NIH peer review system to predict subsequent productivity (in terms of publications, citations, and patents linked to particular grants). These economists obtained access to the major NIH databases in a manner that allowed them to associate publications, citations, and patents with particular R01 grants and their priority scores. They analyzed R01 grants from 1980 to 2008, a total of 137,215 grants. This follows on studies (here and here) that I did while I was at NIH with a much smaller data set from a single year and a single institute as well as a publication from NHLBI staff.

The authors' major conclusions are that peer review scores (percentiles) do predict subsequent productivity metrics in a statistically significant manner at a population level. Because of the large data set, the authors are able to examine other potentially confounding factors (including grant history, institutional affiliation, degree type, and career stage), and they conclude that the statistically significant relationship persists even when correcting for these factors.

Taking a step back, how did they perform the analysis?

(1) They assembled lists of funded R01 grants (both new (Type 1) and competing renewal (Type 2) grants) from 1980 to 2008.

(2) They assembled publications (within 5 years of grant approval) and citations (through 2013) linked to each grant.

(3) They assembled patents linked either directly (cited in patent application) or indirectly (cited in publication listed in application) for each grant.

There are certainly challenges in assembling this data set, and some of these are discussed in the supplementary material to the paper. For example, not all publications cite grant support, so other methods must be used. Also, some publications are supported by more than one grant; in this case, the publication was linked to both grants.

The assembled data set (for publications) is shown below:

Science Figure

By eye, this shows a drop in the number of linked publications with increasing percentile score. But this is due primarily to the fact that more grants were funded with lower (better) percentile scores over this period. What does this distribution look like?

I had assembled an NIH-wide funding curve for FY2007 as part of the Enhancing Peer Review study (shown below):

NIH EPR Figure

To estimate this curve for the full period, I used success rates and numbers of grants funded to produce the following:

R01 funding curve graph

Of course, after constructing this graph, I noticed that Figure 1 in the supplementary material for the paper includes the actual data on this distribution. While the agreement is satisfying, I was reminded of a favorite saying from graduate school: A week in the lab can save you at least an hour in the library. This curve accounts (at least partially) for the overall trend observed in the data. The ability of peer review scores to predict outcomes lies in more subtle aspects of the data.

To extract the information about the role of peer review, the authors used Poisson regression methods. These methods assume that the distribution of values (i.e., publications or citations) at each x-coordinate (i.e., percentile score) can be approximated as a Poisson distribution. The occurrence of such distributions in these data makes sense since they are based on counting numbers of outputs. The Poisson distribution has the characteristic that its expected value equals its variance, so that only a single parameter (λ) is needed to specify the entire distribution. The formula for a Poisson distribution at an integer count n is f(n) = (λ^n * e^(-λ))/n!, where λ is the expected value (and variance) of the count.

Table 1 in the paper presents "the coefficient of regression on scores for a single Poisson regression of grant outcomes on peer review scores." These coefficients have values from -0.0076 to -0.0215. These values are the β coefficients in a fit of the form ln(λ) = α + βk where k is the percentile score from 1 to 100 and λ is the expected value for the grant outcome (e.g. number of publications).
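To make this concrete, here is a minimal sketch of how such a single Poisson regression could be run. The data below are simulated stand-ins generated from the model itself (not the NIH data), and the use of the statsmodels library is my choice, not necessarily the authors'.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated stand-in data: percentile scores and publication counts drawn
# from a "true" model ln(lambda) = alpha + beta * k.
alpha_true, beta_true = 3.7, -0.0158
k = rng.integers(1, 101, size=5000).astype(float)   # percentile scores, 1-100
pubs = rng.poisson(np.exp(alpha_true + beta_true * k))

# Poisson regression of the counts on the percentile score.
X = sm.add_constant(k)                               # columns: intercept, k
fit = sm.GLM(pubs, X, family=sm.families.Poisson()).fit()
print(fit.params)   # estimates of (alpha, beta); beta is the kind of coefficient reported in Table 1
```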

From the paper, for a model that includes corrections for five additional factors (subject-year, PI publication history, PI career characteristics, PI grant history, and PI institution/demographics; see below and the supplementary material for how these corrections are included), the coefficient of regression for both publications and citations is β = -0.0158. A plot of the value of λ as a function of percentile score (k) for publications (with α estimated to be 3.7) is shown below:

Distribution b=-0.0152 plot

The shape of this curve is determined primarily by the value of β.

The value of λ at each point determines the Poisson distribution at that point. For example, in this model at k=1, λ=39.81, and the expected Poisson distribution is shown below:

Poisson distribution-k=1 plot
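As a check on these numbers, a short sketch using the α and β values quoted above reproduces λ at k = 1 and the corresponding Poisson distribution (scipy is used here for convenience):

```python
import numpy as np
from scipy.stats import poisson

alpha, beta = 3.7, -0.0158          # values discussed in the text

def expected_pubs(k):
    """Expected publication count (lambda) at percentile score k."""
    return np.exp(alpha + beta * k)

lam1 = expected_pubs(1)
print(round(lam1, 2))               # ~39.81, as stated above

# Poisson probabilities of observing n publications when lambda = lam1;
# the mean and the variance of this distribution both equal lam1.
n = np.arange(0, 81)
pmf = poisson.pmf(n, lam1)
```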

There will be a corresponding Poisson distribution at each percentile score (value of k). These distributions for k=1 and k=50 superimposed on the overall curve of λ as a function of k (from above) are shown below:

Distribution plot curves

This represents the model of the distributions. However, it does not take into account the number of grants funded at each percentile score (shown above). Including that distribution yields the overall expected density of publications as a function of percentile score for this model, shown as a contour plot below (where the contours represent 75%, 50%, 25%, 10%, and 1% of the maximum density of publications):

Poisson Curves Plot
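The construction behind a figure like this can be sketched as follows: at each percentile score, weight the model's Poisson distribution by the number of grants funded at that score, giving an expected density of grants over (percentile score, publication count). The grant counts per percentile used below are a made-up placeholder for the actual funding curve (the real distribution is in Figure 1 of the paper's supplementary material).

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson

alpha, beta = 3.7, -0.0158
k = np.arange(1, 101)                       # percentile scores
lam = np.exp(alpha + beta * k)              # expected publications at each score

# Placeholder funding curve: many grants at good (low) percentiles, few at
# poor scores. Replace with the actual numbers of funded grants per percentile.
n_grants = 3000.0 * np.exp(-k / 25.0)

# Expected density over (percentile, publication count): grants at score k
# times the Poisson probability of n publications given lambda(k).
n_pubs = np.arange(0, 121)
density = n_grants[:, None] * poisson.pmf(n_pubs[None, :], lam[:, None])

# Contours at fractions of the maximum density, as in the figure above.
levels = np.array([0.01, 0.10, 0.25, 0.50, 0.75]) * density.max()
plt.contour(k, n_pubs, density.T, levels=levels)
plt.xlabel("Percentile score")
plt.ylabel("Publications per grant")
plt.show()
```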

This figure can be compared with the first figure above, which shows the data from the paper. The agreement appears reasonable, although there appear to be more grants with a small number of publications than would be expected from this Poisson regression model. This may reflect differences in publication patterns between fields, the unequal value of different publications, and differences in productivity among PIs.

With this (longwinded) description of the analysis methods, what conclusions can be drawn from the paper?

First, there does appear to be a statistically significant relationship between peer review percentile scores and subsequent productivity metrics for this population. This relationship was stronger for citations than it was for publication numbers.

Second, the authors studied the effects of correcting for various potential confounding factors. These included:

(i) "Subject-year" determined by correcting for differences in metrics by study section and by year as well as by funding institute. This should at least partially account for differences in fields although some study sections review grants from fields with quite different publication patterns (e.g. chemistry versus biochemistry or mouse models versus human studies).

(ii) "PI publication history" determined by the PI's publication history for the five years prior to the grant application, including the number of publications, the number of citations up to the time of grant application, the number of publications in the top 0.1%, 1%, and 5% in terms of citations in the year of application, and these same factors limited to first-author or last-author publications.

(iii) "PI career characteristics" determined by degree type (Ph.D., M.D., or both) and the number of years since completion of the terminal degree.

(iv) "PI grant history" categorized as one previous R01 grant, more than one previous R01 grant, one other type of NIH grant, or two or more other NIH grants.

(v) "PI institution/demographics" determined by whether the PI's institution falls within the top 5, top 10, top 20, or top 100 institutions in this data set in terms of the number of awards, together with demographic parameters (gender, and Asian or Hispanic ethnicity) estimated from PI names.

Including each of the factors sequentially in the regression analysis did not change the value of β substantially, particularly for citations as an output. This was interpreted to mean that the statistically significant relationship between percentile score and subsequent productivity metrics persists even when correcting for these factors. In addition, examining results related to these factors revealed that (from the supplementary material):

"In particular, we see that competing renewals receive 49% more citations, which may be reflective of more citations accruing to more mature research agendas (P<0.001). Applicants with M.D. degrees amass more citations to their resulting publications (P<0.001), which may be a function of the types of journals they publish in, citation norms, and number of papers published in those fields. Applicants from research institutions with the most awarded NIH grants garner more citations (P<0.001), as do applicants who have previously received R01 grants (P<0.001). Lastly, researchers early in their career tend to produce more highly cited work than more mature researchers (P<0.001)."

So what is the bottom line? This paper does appear to demonstrate that NIH peer review predicts subsequent productivity metrics (numbers of publications and citations) at a population level, even when correcting for many potential confounding factors in reasonable ways. In my opinion, this is an important finding given the dependence of the biomedical enterprise on the NIH peer review system. At the same time, one must keep in mind the relatively shallow slope of the overall trend and the large amount of variation at each percentile score. A 1 percentile point change in peer review score resulted in, on average, a 1.8% decrease in the number of citations attributed to the grant. By my estimate (based on the model in this paper), the odds that funding a grant with a 1 percentile point better peer review score over an alternative will result in more citations are 1.07 to 1. The slight slope and the large amount of "scatter" are not at all surprising given that grant peer review is largely about predicting the future, an inherently challenging task, and that the NIH portfolio includes many quite different areas of science.
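For readers interested in how an odds estimate of this kind can be obtained from the Poisson regression model, here is a sketch. It compares independent Poisson draws for two grants one percentile point apart; the α value (which sets the overall level of output) is illustrative, so the result will not match the 1.07-to-1 figure exactly unless the same parameters are used.

```python
import numpy as np
from scipy.stats import poisson

alpha, beta = 3.7, -0.0158        # illustrative values; see caveat above

def odds_better_score_wins(k, n_max=2000):
    """Odds that a grant at percentile k out-produces a grant at k+1,
    assuming independent Poisson counts with means from the fitted model."""
    lam1 = np.exp(alpha + beta * k)          # better (lower) percentile
    lam2 = np.exp(alpha + beta * (k + 1))    # one point worse
    n = np.arange(n_max)
    p1, p2 = poisson.pmf(n, lam1), poisson.pmf(n, lam2)
    cdf1, cdf2 = np.cumsum(p1), np.cumsum(p2)
    p_win = np.sum(p1[1:] * cdf2[:-1])       # P(X1 > X2)
    p_lose = np.sum(p2[1:] * cdf1[:-1])      # P(X2 > X1); ties are ignored
    return p_win / p_lose

print(round(odds_better_score_wins(10), 2))
```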

One disappointing aspect of this paper is the title: "Big names or big ideas: Do peer-review panels select the best science proposals?" This is an interesting and important question, but the analysis is not suited to address it except peripherally. The analysis does demonstrate that PI factors (e.g., publication history, institutional affiliation) do not dominate the effects seen with peer review, but this does not really speak to "big names" versus "big ideas" in a more general way. Furthermore, while the authors acknowledge that they cannot study unfunded proposals, it is likely that some of the "best science proposals" fall into this category. The authors do note that some of the proposals funded with poor percentile scores (presumably picked up by NIH program staff) were quite productive.

There is a lot more to digest in this paper. I welcome reactions and questions.

16 responses so far

Models of Support for Staff Scientist Positions-Matching Funds?

(by datahound) Apr 14 2015

Lots of interesting ideas coming in to my previous post on models of support for staff scientist positions.

Let me add an idea into the mix. Suppose that NIH were to develop a training grant-like mechanism for staff scientists with additional conditions:

(1) No more than XX% of the salary and benefits of each staff scientist position can be supported from the grant with additional funds coming from other grants or institutional funds.

(2) Any staff scientist supported from the grant is guaranteed full support by the institution for a period of YY years after the termination of the grant.

For the purposes of discussion, set XX% = 50% and YY = 2. The idea is to have both NIH and the institution have substantial "skin in the game" with the staff scientists and their positions as the beneficiaries.

As a disclaimer, I do not know enough about the legal framework that could influence how the details of such a plan would work in practice.

Thoughts?


23 responses so far

Models of Support for Staff Scientist Positions

(by datahound) Apr 13 2015

The topic of support of staff scientists has been discussed extensively recently. NCI announced its intention of initiating a new mechanism for the support of such positions (discussed extensively at Drugmonkey). A poll recently showed that 77% of respondents favored creating more staff scientists as a way of dealing with the present postdoc situation.

I recently tweeted a question about the potential of block grants as a mechanism for supporting staff scientist positions. This idea came out of discussions that I had years ago during the NIH "Enhancing Peer Review" process. As is true today, the discussions centered on how to stabilize staff scientist positions as a career path. The block grant model was proposed as a potential alternative to individual awards such as those to be piloted by NCI. The concerns about an individual award model were (1) that the position is still only as stable as a single grant and (2) that the criteria for reviewing staff scientists in conjunction with their environment (associated PI, etc.) appeared to be hard to manage. The use of a larger grant to an institution supporting a cadre of staff scientists could diminish some of these concerns, since such grants could be more stable and could be judged over time by criteria related to the stabilization of staff scientist positions. One obvious downside is that the institutions would be responsible for selecting the staff scientists to be supported, with only indirect outside influence.

What are your thoughts about individual awards versus institutional awards for staff scientists? How could an institutional award be structured to best achieve the goals of creating a larger number of more stable staff scientist positions?

31 responses so far

Research Spending at Academic Medical Centers

(by datahound) Apr 03 2015

Have you heard an administrator say "we lose money on research"? That sentiment often irritates faculty members for at least two reasons. First, the use of the term "lose" implies that the purpose of doing research at a university, medical center, or research institute is to make money, rather than that research is a key component of their reason for existence. Wouldn't the term "spend" be more appropriate? Second, it can sound like the implication is that they would rather be out of the research business because it is costing them so much money.

The Association of American Medical Colleges (AAMC) recently released a short report that provides some insight into the cost accounting used to reach conclusions about these costs. These data are based on surveys of 46 academic medical centers. The bottom-line conclusion is that this collection of academic medical centers spends 53 cents for every dollar of sponsored research support that they receive.

AAMC Figure

The four largest categories are unrecovered indirect (facilities and administrative) costs on sponsored research, facilities and administrative costs on research supported internally, salary support for research effort not covered by sponsored research, and the costs of start-up packages. Additional categories include bridge funding, costs for salaries over the salary cap, and cost sharing either voluntarily committed or volunteered by the institution.

These costs are estimated to average $111,151,553 per medical school with a 95% confidence interval of $21,143,786. I am not sure how to interpret this average (aside from the significant-figure issues) since the direct cost support of the 46 schools participating in the study ranged from $26 million to $751 million, but this does give some sense of the distribution.

Do these figures make sense? If you were in charge, how would you try to allocate these institutional resources?

49 responses so far

NIH Request for Information (RFI): Optimizing Funding Policies and Other Strategies to Improve the Impact and Sustainability of Biomedical Research

(by datahound) Apr 02 2015

NIH just released a new Request for Information. The topic is: Optimizing Funding Policies and Other Strategies to Improve the Impact and Sustainability of Biomedical Research.

The RFI form has four sections for comment:

(2) 

(3) 

(4) 

This is a great opportunity to collect your thoughts on these central issues and share them with NIH. There is really no reason not to respond.

6 responses so far

FOIA-The Freedom of Information Act

(by datahound) Mar 31 2015

As a datahound, I often would like access to data. Ideally, relevant data are available through published reports, websites, or publicly available databases such as NIH RePORTER. However, in many cases, such data are not available. Fortunately, Congress passed and President Johnson signed the Freedom of Information Act (FOIA) in 1966. FOIA allows citizens to request records from any executive branch agency. The government must supply these records subject to a series of exceptions, including national security, personnel records, many pre-decisional memoranda, and several other categories.

A FOIA request must be relatively specific regarding the information requested. Importantly, FOIA applies only to records that already exist. A citizen cannot request (through FOIA) that an agency conduct an analysis that has not already been done, but can request related materials if s/he knows that they exist.

Each agency has its own FOIA process. For example, information about the NIH FOIA process, including how to file a request, can be found here. I first used the FOIA process in an attempt to obtain information about the decision by the NIH Director and the Secretary of Health and Human Services to dissolve the National Center for Research Resources. In this case, the process was quite slow and frustrating. More recently, I used FOIA to obtain information about the age distribution of NIH New and Early Stage Investigators. In this case, the process was more rapid (although it still took a couple of months to get the data) and I received almost all of what I requested.

My most recent FOIA request relates to the discussions of a potential emeritus award from NIH. Conversations with colleagues at the Experimental Biology meeting indicated that some NIH staff members had indicated that the responses to the NIH Request for Information (RFI) had been more positive than those to the Rock Talk post on the subject (which were almost uniformly negative). I have requested the responses to this RFI. Time will tell what I receive and what these records reveal about respondents' thoughts on this possible award.

Update:  The initial response is below:

FOIA-RFI-4-15

13 responses so far

Selected Data and Sources Relevant to Research Enterprise Sustainability

(by datahound) Mar 28 2015

I participated in the follow-up meeting to the Alberts et al. paper that was summarized in a recent PNAS paper. This summary noted that "...most were surprised to learn that the percentage of NIH grant-holders with independent R01 funding who are under the age of 36 has fallen sixfold (from 18% to about 3%) over the past three decades." This statement is probably accurate, but I was disappointed that many participants were not familiar with many important facts and trends that have affected the biomedical enterprise over the past two decades.

What information is important for individuals to know in order to participate in discussions about potential corrections to the present system? In addition to the demographic data noted above, below are some slides that I have used in presentations on the topic of the sustainability of the biomedical research enterprise (some of which are derived from posts here or from my columns at ASBMB Today).

Sustainability slides: Sustainability-1, Slide01 through Slide11

Which of these are most important? What are other important data sets or data sources that should be included in such presentations?

18 responses so far

More Age Data from NIH: Surprising Award Rate Data

(by datahound) Mar 27 2015

In the context of recent discussions of NIH age group data, @dgermain21 pointed to some interesting data in a recent NIH report on physician scientists regarding NIH R01 Award Rates as a function of age group (as well as degree, race/ethnicity, and gender). These data are quite surprising, as shown below:

Award Rate graph

These data are for all individuals in the analysis. I have omitted the curves for individuals 30 or younger and 71+ since these data are relatively noisy, presumably due to the relatively small numbers of individuals in these groups.

The term Award Rate is defined by NIH as "the number of awards made in a fiscal year divided by the absolute number of applications where we don’t combine resubmissions (A1s) that come in during the same fiscal year." Thus, the Award Rate is lower than the Success Rate since the denominator is higher.
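A small worked example (with made-up numbers) may help clarify the distinction between the two rates:

```python
# Hypothetical counts for a single fiscal year (illustrative only)
awards = 25
distinct_projects = 100       # same-year A0/A1 pairs counted once (Success Rate denominator)
total_applications = 130      # every application counted, including same-year A1s (Award Rate denominator)

success_rate = awards / distinct_projects   # 25.0%
award_rate = awards / total_applications    # ~19.2%; always <= the success rate
print(f"Success Rate: {success_rate:.1%}, Award Rate: {award_rate:.1%}")
```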

Of course, the surprising observation is that these rates are highest for the 31-40 age group and decline monotonically, so that they are lowest for the 61-70 age group. This is certainly counter to what I would have expected; I would have anticipated the opposite trend, or perhaps a peak for the 51-60 age group. This observation demands an explanation.

Digging into the report, the Quantitative Analysis Methodology section indicates that

"The NIH awards and time period selected for inclusion in the system from IMPACII (the large internal NIH database) were:

  • Research Project Grants for the following 25 activity codes between 1993 and 2012, Type 1 applications,..."

The term "Type 1" applications refers to new (as opposed to competing renewal) applications. This suggests that the above data may be only for these new applications. Competing renewal applications (Type 2) applications come predominantly from more senior investigators and have substantially higher success rates than new applications. Thus, the restriction to Type 1 applications would be expected improve the importance of younger relative to older investigators. This may be an important contributor to these data, although I still find it surprising that the reported trends still apply to new R01 applications.

Interested readers should look at the report and help try to understand how to interpret these data.

UPDATE

I contacted the individuals responsible for the data in this report. The missing data (zeros for PIs between 61 and 70) have been filled in, and NIH believes that the other data are correct as posted. Thus, the award rate for new (Type 1) R01s appears to decrease monotonically with increasing PI age, and this was true for every year from 1999 to 2012.

7 responses so far

Rock Talk Age Data: Effective Indirect Cost Rates 1998-2014

(by datahound) Mar 26 2015

A recent post on Rock Talk presented data on the amount of funding as a function of PI age group. These data were not presented in a terribly informative way, but a file was available for downloading, and Michael Hendricks normalized the data by the number of PIs in each age group to reveal more interesting trends, discussed at Drugmonkey.

The downloadable data include a breakdown of Direct and Total Costs. I have been looking for such data over a longer period than the last couple of years and thought I would take a look. Below is a plot of the Effective Indirect Cost Rate ((Total Costs - Direct Costs)/Direct Costs) for the overall data set.

Overall indirect graph

The Effective Rate drops from 44.2% in 1998 to a low of 37.2% in 2012 before rising slightly over the past two years. These values are all somewhat lower than I anticipated based on my previous analysis of R01s.
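For clarity, the effective rate used throughout this post is simply (Total Costs - Direct Costs)/Direct Costs; a quick sketch with illustrative dollar amounts:

```python
def effective_indirect_rate(total_costs, direct_costs):
    """Effective indirect cost rate: (Total - Direct) / Direct."""
    return (total_costs - direct_costs) / direct_costs

# Illustrative numbers (not from the downloaded file): a $1.442M total award
# with $1.0M in direct costs corresponds to a 44.2% effective rate.
print(f"{effective_indirect_rate(1_442_000, 1_000_000):.1%}")
```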

To try to gain some insight, I looked at these data as a function of PI age group.

Indirect cost graph

The differences between the age groups are quite substantial and surprising. For the three youngest PI age groups, the Effective Rate is relatively constant at around 47%, consistent with my previous R01 indirect cost analysis. For the older PI age groups, the Effective Rate falls steadily from 1998 to 2012, reaching rates as low as 27.5% for PIs 61-65 in 2012.

I certainly do not understand what underlies these trends, but differences in mechanisms could certainly be involved. It may be that mechanisms such as U01s for larger efforts could be important. As always, it would be best to see data broken down by mechanism to facilitate accurate interpretation.

Any other thoughts on these data are most welcome.

20 responses so far

NIH Institute and Center Strategic Plans

(by datahound) Mar 17 2015

The National Institute of General Medical Sciences (NIGMS) just released its new strategic plan. I was involved in writing the first NIGMS Strategic Plan, released in 2008. I must admit that I was skeptical of the value of writing a strategic plan when we started down the path toward this plan. However, I found both the process and the final product useful. It allowed the institute to formalize our commitment to research and training activities, the balance of the use of particular mechanisms, and so on. Even though many aspects of the plan seemed obvious in terms of the culture of NIGMS, I found myself looking back to the plan and referring others to important sections in the years that followed.

The new NIGMS plan takes a similar approach. One section of interest is:

Objective 1-2: Promote the ability of investigators to pursue new research directions, novel scientific insights and innovative ideas.

This objective includes the following implementation strategies:

1. Align funding decisions with the need to create a broad and diverse research portfolio that maximizes the scientific return on taxpayers’ investments.

2. Support investigators working in underexplored areas of science that are relevant to the NIGMS mission.

3. Conduct regular analyses of NIGMS’ scientific investments to assess their efficacy, distribution and impact, and use this information to help guide programmatic and funding decisions.

4. Pilot and assess alternative mechanisms of funding that emphasize individual investigators, rather than individual projects, to better meet NIGMS goals and objectives.

5. Increase the Institute’s ability to conduct in-depth portfolio analyses and evaluation activities.

Importantly, this includes commitments to analysis and analysis capabilities (strategies 3, 5) as well as a strategy related to the new Maximizing Investigators' Research Award (MIRA) mechanism.

While I am still going through the plan, plans for other NIH Institutes and Centers are available (although some links on this site are broken).

Do you find these plans of any use? How could they be made more useful?

15 responses so far
