I recently posted an analysis of the gender distribution of the awardees and the applicant pools for the four NIH High Risk Research (Pioneer, New Innovator, Early Independence, and Transformative R01) programs for 2015. Data regarding the awardees can be gleaned from NIH RePORTER or the NIH Common Fund website. However, data regarding the applicants are not available except through Freedom of Information Act (FOIA) requests. This is how I obtained the data for the 2015 applicants. I subsequently submitted a request for data regarding the applicant pools for all years since each award program was created. While I intended to get data for each year separately, I was not sufficiently clear in my request and I instead received aggregate data over all years. I have clarified my request and am awaiting a response. In the meantime, I want to share the analysis of the aggregate data.
Here are the data:
Program               Applicants (Male)   Applicants (Female)   Applicants (Gender not given)
Pioneer               2051 (75.4%)        666 (24.6%)           53
New Innovator         3498 (67.2%)        1706 (32.8%)          120
Early Independence    233 (59.9%)         156 (40.1%)           62
Transformative R01    3245 (78.4%)        894 (21.6%)           160
A glance at these data reveals a couple of points. First, the gender balance of the applicant pool follows a pattern that could be anticipated based on the career stage mix, with the closest balance for the Early Independence ("Skip the postdoc") program, followed by the New Innovator Award program, the Pioneer program, and, lastly, the Transformative R01 program. Second, while the number of investigators with genders withheld or unknown is less than 4% for three of the programs, it is 13.7% for the Early Independence program. This introduces some uncertainty in the analysis, as will be discussed later.
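The shares of applicants with gender not given follow directly from the applicant table. A minimal sketch of that arithmetic, where the dictionary and function names are illustrative and not part of the original data:

```python
# Applicant counts from the table above: (male, female, gender not given).
applicants = {
    "Pioneer": (2051, 666, 53),
    "New Innovator": (3498, 1706, 120),
    "Early Independence": (233, 156, 62),
    "Transformative R01": (3245, 894, 160),
}


def share_not_given(male, female, not_given):
    """Percentage of all applicants whose gender was withheld or unknown."""
    return 100 * not_given / (male + female + not_given)


for program, counts in applicants.items():
    print(f"{program}: {share_not_given(*counts):.1f}% gender not given")
```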
How do these results compare with the awardee pools?
Program               Awardees (Male)   Awardees (Female)
Pioneer               110 (71.0%)       45 (29.0%)
New Innovator         263 (66.0%)       135 (34.0%)
Early Independence    54 (75.0%)        18 (25.0%)
Transformative R01    164 (83.2%)       33 (16.8%)
Let's compare the applicant pool with the awardee pool program by program. For the Pioneer program, women correspond to 24.6% of the applicant pool and 29.0% of the awardee pool. This difference is not statistically significant with a p value of 0.18.
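The post does not name the statistical test behind these p values, so the following is only one plausible way to set up the comparison. It is a sketch that assumes a two-sided, pooled two-proportion z-test, and that assumes the awardees are included in the applicant counts, so that awardees are compared with the applicants who did not receive an award. The function and variable names are illustrative. Because the test is an assumption, the values it yields need not match the reported ones exactly.

```python
import math


def two_proportion_z_test(count_a, total_a, count_b, total_b):
    """Two-sided pooled two-proportion z-test. Returns (z, p_value)."""
    p_a = count_a / total_a
    p_b = count_b / total_b
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# (male, female) counts from the tables above; applicants with
# gender not given are left out here.
applicants = {
    "Pioneer": (2051, 666),
    "New Innovator": (3498, 1706),
    "Early Independence": (233, 156),
    "Transformative R01": (3245, 894),
}
awardees = {
    "Pioneer": (110, 45),
    "New Innovator": (263, 135),
    "Early Independence": (54, 18),
    "Transformative R01": (164, 33),
}


def awardees_vs_unfunded(program):
    """Compare the female share of awardees with that of unfunded applicants."""
    app_m, app_f = applicants[program]
    awd_m, awd_f = awardees[program]
    # Assumption: every awardee is also counted in the applicant pool.
    unf_m, unf_f = app_m - awd_m, app_f - awd_f
    return two_proportion_z_test(awd_f, awd_m + awd_f, unf_f, unf_m + unf_f)


for program in applicants:
    z, p = awardees_vs_unfunded(program)
    print(f"{program}: z = {z:+.2f}, p = {p:.3f}")
```

A positive z means women make up a larger share of the awardees than of the unfunded applicants; a negative z means a smaller share. An exact test on the same 2x2 counts would be a reasonable alternative, particularly for the smaller Early Independence pool.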
For the New Innovator program, women correspond to 32.8% of the applicant pool and 34.0% of the awardee pool. This small difference is not statistically significant with a p value of 0.62.
For the Early Independence program, women make up 40.1% of the applicant pool but only 25% of the awardee pool. This difference is statistically significant with a p value of 0.0049. Recall that 62 of the Early Independence applicants had genders that were unknown or withheld. If we assume that these applicants were split with the same proportions as the remainder of the pool (60% male, 40% female), the p value is decreased slightly to 0.0040. If we assume that the applicants of unknown gender were 50% male, 50% female, the p value is increased to 0.0090. Only if we assume that all or nearly all of the applicants of unknown gender were male (and there is no reason to think that this is true) does the p value go above 0.05 (p value = 0.078 if all are assumed male). Thus, it appears very likely that there is a statistically significant decrease in the proportion of women in the Early Independence awardee pool compared with the applicant pool. This observation supports my proposal that NIH should investigate this program carefully to try to understand the source(s) of this disparity.
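The sensitivity check described above amounts to reassigning the 62 applicants of unknown gender under different assumptions and rerunning the test. A minimal sketch follows, again assuming a pooled two-proportion z-test of awardees against unfunded applicants, since the post does not state which test was used. The names are illustrative, and the exact p values depend on the test chosen, so they may differ from those quoted above. The point of the sketch is to show on which side of 0.05 each scenario falls.

```python
import math

# Early Independence counts from the tables above.
KNOWN_MALE, KNOWN_FEMALE, UNKNOWN = 233, 156, 62
AWARDEE_MALE, AWARDEE_FEMALE = 54, 18


def two_sided_p(count_a, total_a, count_b, total_b):
    """Two-sided p value from a pooled two-proportion z-test (assumed test)."""
    pooled = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (count_a / total_a - count_b / total_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def p_value_if(unknown_female_fraction):
    """p value when the given fraction of unknown-gender applicants is female."""
    app_f = KNOWN_FEMALE + UNKNOWN * unknown_female_fraction
    app_m = KNOWN_MALE + UNKNOWN * (1 - unknown_female_fraction)
    # Assumption: awardees are counted within the applicant pool.
    unf_f = app_f - AWARDEE_FEMALE
    unf_m = app_m - AWARDEE_MALE
    return two_sided_p(
        AWARDEE_FEMALE, AWARDEE_MALE + AWARDEE_FEMALE, unf_f, unf_f + unf_m
    )


scenarios = {
    "same split as known applicants": KNOWN_FEMALE / (KNOWN_MALE + KNOWN_FEMALE),
    "50% male, 50% female": 0.5,
    "all male": 0.0,
}
results = {name: p_value_if(frac) for name, frac in scenarios.items()}

for name, p in results.items():
    verdict = "below" if p < 0.05 else "above"
    print(f"{name}: p = {p:.4f} ({verdict} 0.05)")
```

Fractional counts are allowed here because the reassigned applicants are hypothetical; rounding them to whole people would change the results only marginally.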
For the Transformative R01, women make up 21.6% of the applicant pool and 16.8% of the awardee pool. This difference is not statistically significant with a p value of 0.092. However, it is noteworthy that both the applicant pool and the awardee pool are strongly dominated by male principal investigators. This may reflect the relative seniority of applicants to this program or other factors about the program structure or management. This will require more analysis.
These results are summarized in the figure below:
I await the data broken down by year to see if there are meaningful trends over time. In the meantime, these data do provide the scientific community and the NIH some food for thought.