Gender Balance in NIH High Risk Research Programs-2015

(by datahound) Oct 23 2015

When the NIH High Risk Research Program awardees for 2015 were announced, concerns were raised about the balance of women and men among the awardees. This portfolio includes four programs: the NIH Director's Pioneer Award, the NIH Director's New Innovator Award, the NIH Director's Early Independence Award, and the NIH Transformative R01 program. The gender distribution among the awardees is shown below:

Pioneer: 10 men, 3 women (23% women)

New Innovator:  28 men, 13 women (32% women)

Early Independence:  13 men, 3 women (19% women)

Transformative R01:  10 men, 3 women (23% women)


Overall, women hold approximately 27% of research project grants at NIH, so the percentages in the High Risk programs do tend to be low. However, as I noted in my previous post, it is difficult to interpret these percentages without knowledge of the pool of individuals who applied to these programs.

Information about the applicant pool is not publicly available directly. However, I filed a FOIA request on October 8th and was pleased to receive the response yesterday (October 22nd, 2 weeks, record time for me...Thank you NIH staff and NIH FOIA office).

Below is the information that I received regarding the applicant pool gender composition:

Pioneer:  154 men, 49 women, 5 unknown/withheld (24% women among known)

New Innovator:  349 men, 138 women, 10 unknown/withheld (28% women among known)

Early Independence:  36 men, 26 women, 18 unknown/withheld (42% women among known)

Transformative R01:  248 men, 64 women, 17 unknown/withheld (21% women among known)


Note that gender information is not available for 18/80 = 22.5% of the applicants for the Early Independence Award. This may reflect that many of these applicants are new to NIH and have not provided this information. If we assume that all of the applicants with unknown gender are men, then the percentage of women is 33%. If we assume that all of these applicants are women, the percentage of women is 55%.
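The bounds described above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the applicant counts listed in this post; the function name and dictionary layout are illustrative choices, not part of any NIH data format.

```python
# Applicant counts reported in the post: (men, women, unknown/withheld).
applicants = {
    "Pioneer": (154, 49, 5),
    "New Innovator": (349, 138, 10),
    "Early Independence": (36, 26, 18),
    "Transformative R01": (248, 64, 17),
}


def women_share_bounds(men, women, unknown):
    """Return (low, known, high) percentages of women.

    low   assumes every applicant of unknown gender is a man,
    known ignores applicants of unknown gender,
    high  assumes every applicant of unknown gender is a woman.
    """
    total = men + women + unknown
    low = 100 * women / total
    known = 100 * women / (men + women)
    high = 100 * (women + unknown) / total
    return low, known, high


bounds = {name: women_share_bounds(*counts) for name, counts in applicants.items()}
for name, (low, known, high) in bounds.items():
    print(f"{name}: {known:.1f}% among known (range {low:.1f}% to {high:.1f}%)")
```

The range is narrow for the three programs with few unknowns and wide for the Early Independence Award, which is why the missing data matter most there.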


For the Pioneer program, the percentage of women awardees matches the percentage of women applicants. Based on the numbers, the p-value is 1.00, that is, there is no evidence that these distributions are different.

For the New Innovator program, the percentage of women awardees is slightly higher than the percentage of women in the applicant pool (32% versus 28%). The p-value is 0.72, indicating that the gender distribution of awardees is reasonably likely given the gender distribution of applicants.

For the Early Independence program, the percentage of women awardees is lower than the percentage in the applicant pool (19% versus 33-55%). Using the numbers for those with known gender, this mismatch has a p-value of 0.15. This is concerning, as I will discuss shortly.


For the Transformative R01 program, the percentage of women awardees is slightly higher than the percentage in the applicant pool (23% versus 21%). The p-value is 0.74.
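The post does not say which statistical test produced these p-values. One plausible choice for small counts like these is a two-sided Fisher's exact test comparing awardees with applicants of known gender. The sketch below makes that assumption explicit and implements the test with the standard library only. Note that awardees are a subset of applicants, so treating the two rows as separate groups is itself an approximation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min, k_max = max(0, row1 + col1 - n), min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(k_min, k_max + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# (women awardees, men awardees), (women applicants, men applicants),
# using the counts reported in this post (known gender only).
programs = {
    "Pioneer": ((3, 10), (49, 154)),
    "New Innovator": ((13, 28), (138, 349)),
    "Early Independence": ((3, 13), (26, 36)),
    "Transformative R01": ((3, 10), (64, 248)),
}

p_values = {
    name: fisher_exact_two_sided(aw_w, aw_m, ap_w, ap_m)
    for name, ((aw_w, aw_m), (ap_w, ap_m)) in programs.items()
}
for name, p in p_values.items():
    print(f"{name}: p = {p:.2f}")
```

A different test (for example a chi-square test or a comparison of awardees with non-awardees) would give somewhat different values, so these numbers should be read as illustrative.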

Thus, for three of the programs, there is no evidence of bias going from the applicant pool to the awardee pool. However, the percentages of women in the applicant pools are relatively low (21 to 28%). Particularly for the New Innovator program, the fact that only 28% of the applicants are women may reflect the pool of eligible faculty (although examining this will require additional data) or may reflect whether eligible women apply at the same frequency as do eligible men.


The most concerning data are for the Early Independence program. There is reasonably strong evidence for bias against women in moving from the applicant pool to the awardee pool (although knowledge of the magnitude of this effect is limited by the missing data for the applicant pool). There are at least two levels where this bias may be manifest. First, of course, is the review and selection process. But, one should keep in mind that this program requires considerable evidence of institutional support. Each institution is limited to two applicants and the application requires details about institutional support. Regardless of the sources, the NIH should examine this aspect of this program in short order to understand and try to correct any shortcomings of the process. This program has the potential to be particularly valuable for women since it is intended to shorten the time to independence, potentially better aligning the career path with biological clocks for those interested in having a family.

The data that I obtained allow one additional bit of analysis. The success rates for the program are as follows:

Pioneer:  13/208 = 6.3%

New Innovator:  41/497 = 8.2%

Early Independence:  16/80 = 20%

Transformative R01:  13/329 = 4.0%
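As a quick consistency check, the denominators above equal the sums of the men, women, and unknown/withheld applicant counts given earlier in this post. A minimal sketch of that arithmetic, with variable names chosen only for illustration:

```python
# Awardee totals and applicant breakdowns (men, women, unknown) from this post.
awards = {
    "Pioneer": 13,
    "New Innovator": 41,
    "Early Independence": 16,
    "Transformative R01": 13,
}
applicant_breakdown = {
    "Pioneer": (154, 49, 5),
    "New Innovator": (349, 138, 10),
    "Early Independence": (36, 26, 18),
    "Transformative R01": (248, 64, 17),
}

applications = {name: sum(counts) for name, counts in applicant_breakdown.items()}
success_rate = {name: 100 * awards[name] / applications[name] for name in awards}

for name, rate in success_rate.items():
    print(f"{name}: {awards[name]}/{applications[name]} = {rate:.2f}%")
```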

While these success rates are low, the application processes, at least for the Pioneer and New Innovator awards, are relatively streamlined and, in my opinion, many additional scientists should consider applying to these programs. For the Early Independence program, the success rate is relatively high, but this reflects the limitation of two applicants per institution. This limitation presents another point of potential bias toward particular types of applicant.

These programs are important in their own right and are flagships for NIH. It is essential that they be examined carefully to ensure as much as possible that they are serving their stated goals and are capturing the full range of outstanding scientific talent in the community.


NIH "High Risk" Programs-Gender Issues-Part 1-Pioneer Program

(by datahound) Oct 12 2015

With the recent announcement of the "high-risk" research awards from the NIH, a discussion on Twitter began around the relatively low number of women awardees for these awards. I will return to this issue later but first I want to provide some background. For this post, I will focus on the first in this suite of what is now four programs.

The NIH Director's Pioneer Award (DP1) was initiated as part of the original NIH Roadmap for Medical Research. The award was intended to be an experiment driven by a frequent concern raised to Elias Zerhouni, then the NIH Director, that there were a number of highly innovative researchers in fields relevant to the NIH mission who were not applying to NIH because of the relative opacity and complexity of the NIH application process.

There were a large number of nominees (the program initially involved nominations rather than applications) and most of the nominations were submitted shortly before the deadline, catching NIH slightly off guard (more on this later). The Pioneer application involves a 5 page essay (rather than the more standard R01-type application of 25 pages at the time) and 22 of the most highly rated applicants are interviewed in person in Bethesda. This program was initiated just as I was starting my position as Director of NIGMS and I was not involved in the program in the first year.

When the first Pioneer awards were announced in September of 2004, I was surprised and disappointed by the outcome. There were nine recipients, several of whom were relatively well established within the NIH community including, for example, Steve McKnight (who was already well recognized within NIH as an innovative and productive scientist although he has gone on to make some controversial statements about the scientific community) and Homme Hellinga (who was recognized as a rising star at the time although much of his research has turned out to be, at best, irreproducible). I was hoping to go back to my office to google the awardees because I had not heard of them or did not know much about them. In addition, all nine of the awardees were male and this, appropriately, raised concerns within the scientific community both outside and inside NIH.

After the next meeting of the Institute and Center directors, I was sharing my views with Raynard Kington, then Deputy Director of NIH. He listened carefully and told me that Dr. Zerhouni needed to hear such concerns and I dutifully went back to my office and composed a long email. A couple of days later, I walked into a meeting at which both Drs. Kington and Zerhouni were present. They called me over and asked if I/NIGMS would like to take over running the Pioneer program. I was delighted, if a bit daunted, by this opportunity and asked some of my key colleagues, including former acting NIGMS Director Judith Greenberg, if they would be willing to help with this effort.

We had a bit of time to review the processes that were used the first year and made a number of small changes: removing a "leadership potential" criterion that was used the first year, since it seemed too peripheral to the goals of the program and had the potential to introduce biases of various sorts; allowing self-nominations and later applications; recruiting a more diverse pool of reviewers (more on this later); reaching out more aggressively through many outlets about the Pioneer program; and reminding applicants and reviewers at all stages that "pioneering" researchers are quite diverse in all dimensions including gender, race and ethnicity, field, and career stage.

We again received a large number of applications and the process worked fairly smoothly. The end result was 13 awardees in a wide range of fields and career stages including 7 men and 6 women. As one would expect given access to $500K per year for 5 years as well as a competitive selection process, these investigators have done quite well, some exceptionally so.

The process continued for several more years with relatively similar results. After a total of five years were complete (so that we would have a reasonable data set), we initiated a process evaluation. This was completed and released in 2010. This is quite a thorough report and I encourage interested readers to have a look in its entirety.

With regard to gender distributions of Pioneer applicants, interviewees, and awardees, the key findings were:

The percentage of female applicants ranged from 22% to 27% with a mean of 25%. This number increased the year after we took over the program, a reassuring result after the results of the first year.

The percentage of female interviewees was 27% and the percentage of female awardees was 29%. The differences between these percentages and the applicant pool were not statistically significant.

The percentage of female awardees at 29% was higher than the percentage of female R01 awardees over the same period (23%).

One striking and distressing result from the first year was the percentage of women among the reviewers. These results are shown below:

Pioneer Evaluators

While it is important to keep in mind that the gender makeup of a review group often does not eliminate or even reduce unconscious gender bias (example), the results from the first year of the Pioneer program were quite worrisome. The NIH staff running the program did not anticipate the number of nominees (1331) and had to scramble to recruit enough reviewers on short notice. With that constraint, the result was 59 men and 4 women, including only 1 woman on the interview committee.

As an aside, the first years of the Pioneer program were run before existed. A special system had to be built and this allowed collection of data about exactly when applications were submitted. The results for the first year that NIGMS ran the program are shown below:

Pioneer timing

This shows the number of nominations/applications as a function of the date from the opening of the submission site (3/1) to the closing date (4/1). This reveals that many applicants submitted within the last few days before the due date. In addition, the eventual awardees (shown with red bars) tended to submit late in the day, including a few minutes before the deadline. I would never have so much faith in a website.

I will discuss some of the other programs in subsequent posts. For now, I welcome thoughts about this analysis of the Pioneer program including gender balance issues. I have submitted a FOIA request for information about the applicant pools for these programs for the current year so that I hope to have data to do some analysis beyond looking at the awardees.


Lasker to Nobel: Clinical versus Basic-Part 2

(by datahound) Oct 12 2015

In a recent post, I analyzed more than 50 years of Lasker Award winners with regard to the likelihood that a winner went on to receive a Nobel Prize. Six percent of the Lasker Clinical Science awardees had gone on to receive Nobel Prizes compared with 48% of the Lasker Basic Science awardees.

Driven in part by a comment on this post, I looked at the degrees received by these Nobel laureates in the course of their training. Of the six Clinical Science awardees, 2 had MDs, 1 had an undergraduate medical degree (MB ChB), 2 had PhDs, and 1 had an engineering degree. As far as I can tell, none of the six were active in clinical practice.

Of the 58 Basic Science awardees, 22 had MDs (or MD-PhDs). Some of these MDs completed internships and residencies, but very few of them appear to have been active in clinical practice past their training.

For Nobel prizes that went to multiple researchers, there appear to be few or no examples of awards to clinical scientists and basic scientists working together on a project.


Lasker to Nobel: Clinical versus Basic

(by datahound) Oct 07 2015

In response to my recent post regarding Nobel Prizes, a commenter suggested that comparing the number of winners of the Lasker Award for Clinical Medical Research who had gone on to receive a Nobel Prize with the number of winners of the Lasker Award for Basic Medical Research who had done so would be one way to quantify any bias toward basic science in the Nobel Prizes. This seemed like a reasonable suggestion so I tallied the numbers for these categories.

Since 1960, 100 individuals have won or shared the Lasker Award in Clinical Medical Research. Of these, 6 have gone on to receive Nobel Prizes (including Youyou Tu this year). Over the same period, 121 individuals have won or shared the Lasker Award in Basic Medical Research. Of these, 58 have gone on to win a Nobel! Needless to say, this is statistically significant with p < .00000000001.
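The post does not name the test behind this p-value. As a rough check on its order of magnitude, a pooled two-proportion z-test (an assumption, chosen here because it needs only the standard library) can be applied to the counts above.

```python
from math import erfc, sqrt


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# 6 of 100 Clinical awardees versus 58 of 121 Basic awardees became laureates.
z, p_value = two_proportion_z_test(6, 100, 58, 121)
print(f"z = {z:.2f}, p = {p_value:.1e}")
```

With counts this lopsided, any reasonable test will report a very small p-value; the normal approximation is used here only for convenience.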

This clearly reflects a bias favoring basic research although there are some additional factors affecting these numbers such as that approximately a third of the Lasker Basic Medical Research awardees have gone on to win Nobels in Chemistry rather than Physiology or Medicine. There may be more tomorrow morning.


Nobel Prize Week

(by datahound) Oct 02 2015

The Nobel Prizes will be announced next week starting with the Physiology or Medicine Prize on Monday morning. Nobel week took on substantial significance to me during my time as Director of NIGMS. NIGMS had hired a talented science writer in the Communications Office who pointed out that Nobel week was actually quite stressful for science writers. They wake up early in the morning, find out who won the prize, and then have to have a story ready for their editor in short order that might actually be on the front page without knowing the science or having sources available to talk in the early morning. At her suggestion, we started reaching out to reporters prior to the announcements based on the fact that many Nobel Prize winners had been supported by NIH (and, in particular, NIGMS, more on that later) in the past. Every year that I was Director, either the Physiology or Medicine Prize or the Chemistry Prize was awarded to one or more NIGMS grantees.

The outreach strategy was quite successful. I woke up, looked at the Nobel announcement, and then started fielding emails and phone calls from reporters. One of my favorite moments was in 2006 when I was asked by a New York Times reporter why the US tended to do so well with Nobel Prizes. I remember thinking that if I blew this one, I would probably not have a job by the end of the day. I commented that it reflected taxpayer support for science through agencies such as NIH and NSF. This was paraphrased in the article:

Dr. Jeremy M. Berg, director of the National Institute of General Medical Sciences at the National Institutes of Health, said the honor showed the importance of taxpayer-supported basic research that is not aimed at a certain goal. His institute has provided financial support for Dr. Kornberg’s work since 1979, even when it was unclear if the research would be successful, he said.

Now for some data. Over its history, NIH has supported a total of 145 Nobel laureates. Since its founding in 1962, NIGMS has supported 81 Nobel laureates. This represents 56% of the Nobel Prize winners with approximately 8% of the overall budget. NCI appears to have supported the second most with 29 laureates. Bear in mind that many Nobel laureates have been supported by more than one institute.

While I was President of ASBMB, I wrote a column on the role of serendipity in important discoveries. I examined all of the Nobel Prize winners in Physiology or Medicine, or Chemistry for a 25 year period and subjectively scored the discoveries as serendipity (such as the discovery of RNA interference), driven by problem selection (such as the determination of the structure of RNA polymerase), or a hybrid of these two factors. Overall, I concluded that, of 117 laureates, 14 made serendipitous discoveries, 72 won based on choosing a clearly fundamental problem, and 31 were hybrids.

I have my predictions made for the prizes for this year. I am usually wrong, but have gotten a couple right over the years. One of my long-running favorites, Carl Djerassi, died in February having never been awarded a Nobel despite leading the invention of the birth-control pill and many other fundamental contributions to chemistry. We'll find out what happens next week.


The Demise of 38,000 NIH-funded Investigators

(by datahound) Aug 19 2015

In my first post using R to analyze NIH data, I examined the number of unique investigators funded by NIH per year as a function of time. The definition of "unique PIs" was based on the number of unique "Contact PI Person ID" numbers in the NIH RePORT database from 1985 to 2014. Overall, this number was 216,521.

As I prepared my data set for more analysis, I discovered that some investigators had more than one Contact PI Person ID number. I have spent the past 2 months trying to sort this out and I am still not done. An investigator in the intramural program has well over 100 ID numbers over time! Getting this sorted out is crucial for future analyses, particularly longitudinal ones that are so important. Otherwise, an investigator might appear to have a gap or termination in funding just because their ID number changed.

In addition, there are problems in the other direction, with multiple names associated with one ID number. A very small number of these appear to be cases where different people have been assigned to the same ID number. Most are related to non-uniformity in how names are entered (e.g. with or without a middle initial, with or without a period on the middle initial). Some are good to have been captured, such as PI name changes associated with changes in marital status.
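To illustrate the kind of cleanup involved, here is a minimal sketch that normalizes name variants and flags ID numbers that still map to more than one name. The records, the "LAST, FIRST MIDDLE" layout, and the function name are all hypothetical; this is not the procedure used for the analysis in this post, which was done in R.

```python
import re
from collections import defaultdict


def normalize_name(raw):
    """Collapse common formatting variants of a 'LAST, FIRST MIDDLE' name.

    Drops periods, extra whitespace, the NMN (No Middle Name) placeholder,
    and the middle name or initial. Dropping the middle initial can merge
    distinct people, so flagged cases still need manual review.
    """
    name = raw.upper().replace(".", "")
    name = re.sub(r"\s+", " ", name).strip()
    last, _, rest = name.partition(",")
    parts = [p for p in rest.split() if p != "NMN"]
    first = parts[0] if parts else ""
    return f"{last.strip()}, {first}"


# Hypothetical (person ID, name as entered) records.
records = [
    (101, "SMITH, JANE A."),
    (101, "Smith, Jane A"),
    (101, "SMITH, JANE"),
    (102, "DOE, JOHN NMN"),
    (102, "DOE, JOHN"),
    (103, "LEE, ALEX"),
    (103, "PARK, ALEX"),
]

names_by_id = defaultdict(set)
for person_id, raw_name in records:
    names_by_id[person_id].add(normalize_name(raw_name))

# IDs that still have several distinct names after normalization.
needs_review = sorted(pid for pid, names in names_by_id.items() if len(names) > 1)
print(needs_review)
```

In this toy data, only the ID with two different last names remains flagged, which could be either a name change or two people sharing one ID.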

At this point, I am down to 178,122 unique ID numbers and I expect this number to fall further. While this has been a great exercise in learning R as well as examining creative practices in data entry (I did not previously know that NMN would be entered in some cases where an individual gives No Middle Name), I am ready to finish up this stage and get on with more interesting analyses. But, with "data science" as with other types of science, time spent checking the validity of raw data before other analyses are done is time well spent.


First Outstanding Investigator (R35) Awards from NCI

(by datahound) Aug 12 2015

The R35 mechanism is emerging at NIH as a mechanism for providing more stable (i.e. longer-term and for research programs rather than projects) support for selected investigators. The first R35 program out of the box was the NCI Outstanding Investigator Award, followed by the NIGMS MIRA Award. NINDS has also recently announced an outstanding program.

The first 17 R35 awards from NCI appeared in NIH RePORTER recently. These investigators cover the NCI mission fairly well (biology, genomics, surveillance, prevention including behavior, treatment). These investigators also have a wide range of funding with core support for FY2014 ranging from $230 K annual total costs to $5.8 M with a median of approximately $700 K total costs (although these values are somewhat subject to judgment since considerable support comes from P30 Cancer Center grants and program project grants (P01s)). I tried to provide lower estimates. The investigators are relatively diverse with regard to age with estimated ages ranging from 41 to 74 with an estimated median age of 56. The initial group includes 13 men and 4 women.

More awards are appearing in RePORTER; 4 additional awards have appeared since I did this initial analysis so expect updates.


Percentages of Faculty Salary Support at Academic Medical Centers

(by datahound) Jun 17 2015

There has been much discussion of the percentages of faculty salaries coming from internal versus external sources. In the context of helping prepare a recent paper from leaders of academic medical centers, I was able to obtain some data from the AAMC (Association of American Medical Colleges) regarding the distribution of levels of extramural support across 72 academic medical centers for 2013. These data are shown below:

STL Figure


These data were collected under terms of strict anonymity for institutions. Furthermore, as noted in the caption, they were collected by obtaining the total amount of extramural support going to faculty salaries and dividing by the total amount going to salaries for individuals with at least some extramural support. Thus, distributions of levels of support across a given institution are not available. Nonetheless, these distributions provide some sense of the range of individual institutional behavior that is more informative than an overall median with no other information.
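The calculation described above yields one pooled number per institution. A small sketch with made-up salary figures (purely hypothetical, not AAMC data) shows how that pooled percentage can hide a wide spread across individuals.

```python
# Hypothetical faculty at one institution: (total salary, extramural portion).
faculty = [
    (200_000, 150_000),
    (180_000, 45_000),
    (150_000, 0),       # no extramural support, so excluded from the ratio
    (120_000, 60_000),
]

# Only individuals with at least some extramural support are counted.
supported = [(salary, extramural) for salary, extramural in faculty if extramural > 0]

# Institution-level value: total extramural salary support / total salary.
institution_pct = (
    100 * sum(extramural for _, extramural in supported)
    / sum(salary for salary, _ in supported)
)

# Individual-level values, which the pooled figure does not reveal.
individual_pcts = [100 * extramural / salary for salary, extramural in supported]

print(f"Institution: {institution_pct:.0f}%  Individuals: {individual_pcts}")
```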


IC Distributions for R01s from PIs with Multiple R01s

(by datahound) Jun 08 2015

In my previous post, I examined the fraction of NIH PIs who had either a single R01 (or R37 Merit Award) or multiple R01s for fiscal year 2014. Overall, about 30% of R01 PIs had more than 1 R01. In the comments and on Twitter, the issue came up about whether those with multiple R01s had them from the same IC or from multiple institutes.

To address this, I asked the question: If a PI had an R01 from one institute, what is the distribution of ICs for the additional R01s going to the same PI? The results are tabulated below:



Overall, the percentage of those additional R01s coming from the same IC ranges from 47 to 75%. For those that do not come from the same IC, the number of ICs contributing substantially ranges from a few to many, as illustrated below (which depicts the data above displayed as the fraction of the R01s from the different ICs given an R01 from a particular IC).

Mult PI IC Graph

For example, if a PI has one R01 from AA (NIAAA), 61% of additional R01s come from AA and 18% come from DA (NIDA), leaving 21% for the remaining ICs. In contrast, if a PI has a grant from GM (NIGMS) or CA (NCI), it takes 4 additional ICs to reach 18% of additional R01s.
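One way to compute such a conditional distribution is sketched below. The grant list is a tiny hypothetical example, and the counting convention (each R01 taken in turn as the "given" grant, with every other R01 held by the same PI counted as "additional") is an assumption, since the post does not spell out the exact tabulation.

```python
from collections import Counter, defaultdict

# Hypothetical (PI, IC) pairs, one per R01.
grants = [
    ("pi1", "AA"), ("pi1", "AA"), ("pi1", "DA"),
    ("pi2", "AA"), ("pi2", "DA"),
    ("pi3", "GM"), ("pi3", "CA"),
    ("pi4", "GM"),
]

# Collect the ICs of all R01s held by each PI.
ics_by_pi = defaultdict(list)
for pi, ic in grants:
    ics_by_pi[pi].append(ic)

# additional[given][other]: count of additional R01s from `other`
# held by PIs who have an R01 from `given`.
additional = defaultdict(Counter)
for ics in ics_by_pi.values():
    for i, given in enumerate(ics):
        for j, other in enumerate(ics):
            if i != j:
                additional[given][other] += 1


def fractions(given):
    """Distribution of ICs for additional R01s, given an R01 from `given`."""
    total = sum(additional[given].values())
    return {ic: count / total for ic, count in additional[given].items()}


print(fractions("AA"))
```

PIs with a single R01 (such as pi4 here) contribute nothing, since they have no additional grants.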

Which ICs are linked by having PIs with multiple R01s? I examined the top two contributions of additional R01s for each IC (in addition to the IC itself). In these "top two lists", I joined the pairs of ICs. I used a bold line if the link was bidirectional, that is, each IC appeared on the top two list of the other. The results are depicted below:

IC-IC graph-2-rev


Overall, the patterns that emerge are as might be anticipated. The bidirectional links are between AA-DA, MH-NS, DK-HL, CA-GM, and CA-AI. Some of the larger ICs are linked to many other ICs, reflecting both their size and their relatively broad missions.
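The "top two list" construction can be sketched as follows. The counts are invented for illustration, and ties are not handled specially; this is only one reading of the procedure described above.

```python
# Hypothetical counts of additional R01s from other ICs (same-IC counts excluded).
additional = {
    "AA": {"DA": 30, "MH": 10, "NS": 5},
    "DA": {"AA": 25, "MH": 20, "NS": 4},
    "MH": {"NS": 40, "DA": 15, "AA": 3},
    "NS": {"MH": 35, "HL": 12, "DA": 6},
    "HL": {"NS": 9, "MH": 2},
}

# Top two contributing ICs for each IC.
top_two = {
    ic: sorted(others, key=others.get, reverse=True)[:2]
    for ic, others in additional.items()
}

# links[(a, b)] is True when the link is bidirectional, i.e. each IC
# appears on the other's top two list (the bold lines in the figure).
links = {}
for ic, partners in top_two.items():
    for partner in partners:
        pair = tuple(sorted((ic, partner)))
        bidirectional = ic in top_two.get(partner, [])
        links[pair] = links.get(pair, False) or bidirectional

bold_links = sorted(pair for pair, is_bold in links.items() if is_bold)
print(bold_links)
```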


As noted in the comments, some of these connections could be attributed to the size of the ICs. Thus, NCI appeared to be linked to many other ICs, but this could be due to the large number of R01s awarded by NCI rather than by actual content overlap.

To address this, I simulated results assuming that the probability of an additional grant coming from a particular IC was proportional to the number of grants that this IC awarded in this data set. I then compared the simulated results with the actual results. Of course, the number of grants going to the same IC was much higher than would be expected. Since this distorted the other statistics, I set all of these values equal to 0 and re-simulated the data. I (or, more correctly, R) performed 1000 simulations and then calculated the mean, standard deviation, and other statistics for these distributions of grant numbers. I then compared these with the actual values observed in the data. The results (presented as the log (base 10) of the probability of occurring by chance) are presented below:



These results allow assessment of the strength of the interactions corrected for IC size.
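A size-corrected simulation of this kind can be sketched as follows. The original analysis was done in R; this Python version uses invented IC sizes and counts, and it converts the comparison to a probability with a normal approximation, which is an assumption rather than the method reported in the post.

```python
import random
from math import erfc, log10, sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical inputs (not NIH data).
ic_sizes = {"CA": 500, "GM": 400, "AA": 60, "DA": 120}  # R01s awarded per IC
given_ic, partner_ic = "AA", "DA"
n_additional = 200  # additional R01s held by PIs with an R01 from given_ic
observed = 90       # of those, the number actually coming from partner_ic

# Same-IC grants are excluded (set to 0); the remaining ICs are
# weighted in proportion to the number of grants they award.
others = [ic for ic in ic_sizes if ic != given_ic]
weights = [ic_sizes[ic] for ic in others]

simulated = []
for _ in range(1000):
    draws = random.choices(others, weights=weights, k=n_additional)
    simulated.append(draws.count(partner_ic))

sim_mean, sim_sd = mean(simulated), stdev(simulated)
z_score = (observed - sim_mean) / sim_sd

# One-sided normal tail probability, reported as log10.
log10_p = log10(0.5 * erfc(z_score / sqrt(2)))
print(f"mean = {sim_mean:.1f}, sd = {sim_sd:.2f}, z = {z_score:.1f}, log10(p) = {log10_p:.0f}")
```

A strongly negative log10(p) indicates a link between two ICs that is far stronger than IC size alone would predict.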

The strongest interactions are between NIDA and NIAAA, with probabilities of occurring by chance of < 10^-88.

The other strong interactions are:


NIAMS and NIDCR (which was still detected previously even though these are both relatively small ICs)




NIDCD and NEI (which was not detected previously)

The link between NCI and NIGMS is still the strongest link between NCI and another IC, but it is substantially less pronounced than the other links above.

Thanks for the comments. I think this is a much improved analysis and I had an excuse to explore additional R tools.

I am now working on generating a 2-dimensional figure that is more consistent with these connectivities in a more formal way.


Single vs Multiple R01 Holders by IC

(by datahound) Jun 05 2015

On a recent Drugmonkey post on the new NIGMS MIRA Award announcement, a commenter suggested that NIGMS might have more R01 PIs with more than 1 R01. With my new R tools, it was relatively straightforward to check this.

Below is a table with the number of PIs (not counting multiple PIs in this analysis) from each IC who have 1 R01 or more than 1 R01 for fiscal year 2014 (R37s are also included). The abbreviations for the ICs are shown with the IC number. Note that the additional R01s can be from the same or a different IC.



As can be seen, NIGMS (GM) is actually slightly below the median (not weighted by the number of PIs) of 0.304 and below all of the other large ICs (CA, AI, HL).
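The tabulation behind this comparison can be sketched as below. The grant list is hypothetical, and the convention used (a PI is counted under every IC from which they hold an R01, and "more than 1" refers to their total across all ICs) is an assumption based on the description above.

```python
from collections import Counter, defaultdict
from statistics import median

# Hypothetical (contact PI, IC) pairs, one per R01 or R37.
grants = [
    ("pi1", "GM"), ("pi1", "CA"),
    ("pi2", "GM"),
    ("pi3", "GM"),
    ("pi4", "CA"), ("pi4", "CA"),
    ("pi5", "AI"),
    ("pi6", "AI"), ("pi6", "GM"),
]

# Total number of R01s per PI across all ICs.
total_per_pi = Counter(pi for pi, _ in grants)

# Set of PIs holding at least one R01 from each IC.
pis_by_ic = defaultdict(set)
for pi, ic in grants:
    pis_by_ic[ic].add(pi)

# Fraction of each IC's PIs who hold more than one R01 overall.
fraction_multiple = {
    ic: sum(total_per_pi[pi] > 1 for pi in pis) / len(pis)
    for ic, pis in pis_by_ic.items()
}

# Median across ICs, not weighted by the number of PIs.
unweighted_median = median(fraction_multiple.values())
print(fraction_multiple, unweighted_median)
```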

Other queries welcome!

Updated:  I discovered an error in the table that I originally posted. A revised table is included. None of the conclusions were affected.

