One of the most quoted statistics about the NIH is that the average age of an investigator receiving their first R01 is approximately 42. The increasing age of "New Investigators" has been the cause of considerable concern across NIH and the scientific community. When I was at NIH, many realized that defining a "New Investigator" as someone who had not previously received substantial NIH funding led to a quite heterogeneous group. New Investigators, whom many imagined as scientists in the early stages of their careers, also included senior scientists who came from other fields (where their research support had come from NSF and other non-NIH agencies) or from other situations (such as other countries) where their research had been supported by other agencies. NIH did some internal analysis that revealed that approximately half of the "new investigators" were in the early stages of their careers while the other half were more senior. This led to the definition of an "early stage investigator" or ESI as someone who was within 10 years of their terminal degree or the end of their clinical training.
As my readers have likely discovered, I feel that single statistics such as "an average age of 42" are dangerous to interpret without looking at the data and distributions that underlie them. After searching online and asking several sources at NIH, without success, whether data about the age distributions were publicly available, I filed a request with NIH under the Freedom of Information Act (FOIA) on October 12 for available age data for New Investigators and ESIs for fiscal years 2006 through the present. Today, I received an email with an attached spreadsheet responsive to my request.
The spreadsheet contains New Investigator data from FY2006 through FY2013 and ESI data from FY2009 through FY2013. The data have a few limitations. First, ages 31 and below and ages 55 and above are binned. Second, cells that would contain fewer than 10 individuals are left blank. While this is done to protect the anonymity of individuals who would fall in these cells, I am not sure how this applies here. Third, age data are not available for approximately 8-10% of these grantees.
The age distributions for New Investigators are shown below:
This plot shows relatively little change from FY2006 to FY2013. The medians calculated from these data are shown below:
These values support the conclusion that little change has occurred.
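For readers who want to compute medians from binned counts like these, here is a minimal sketch in Python. It assumes one-year age bins and interpolates linearly within the bin that contains the halfway point. The function name `binned_median` is my own for illustration, and this is not necessarily how the medians above were calculated.

```python
def binned_median(ages, counts):
    """Estimate a median from counts per one-year age bin.

    ages[i] is the integer age labelling bin i, treated as the
    interval [ages[i], ages[i] + 1). counts[i] is the number of
    grantees in that bin (use 0 for a blank cell, or an estimate).

    Assumption: every bin is one year wide. The open-ended bins
    (31 and below, 55 and above) can be passed as single bins, but
    the result is only meaningful if the median does not fall in one.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("no observations")
    half = total / 2.0
    cumulative = 0.0
    for age, n in zip(ages, counts):
        if n > 0 and cumulative + n >= half:
            # Interpolate linearly inside the bin that crosses 50%.
            return age + (half - cumulative) / n
        cumulative += n
    raise ValueError("counts do not reach the halfway point")
```

Because of the interpolation, the estimate moves smoothly as counts shift between neighboring bins, which is helpful when comparing fiscal years whose medians differ by less than a year.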
The age distributions for Early Stage Investigators are shown below:
These distributions also show relatively little change. The distributions are approximately Gaussian with relatively little skew, suggesting that the limit of being within 10 years of the terminal degree is not having a dramatic effect.
The medians for these distributions are shown below:
Again, relatively little change has occurred; if anything, the median age appears to have increased slightly over this period.
Finally, the availability of both distributions allows the calculation of the distributions for non-ESI New Investigators. The missing ESI data (due to cells with fewer than 10 grantees) were estimated by fitting Gaussians to the distributions. The curves for ESI and non-ESI New Investigators are compared below:
The curves for non-ESI New Investigators are skewed, with one relatively steep arm with a halfway point slightly above age 40 and the other, more gradual arm with a halfway point near 50. The medians for the non-ESI New Investigators are shown below:
The medians here appear to have moved up approximately 1 year over this period.
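The step of estimating the blank ESI cells and subtracting to get the non-ESI distribution can be sketched as follows. This is one possible approach under stated assumptions, not a record of the exact fit I used: it fits a Gaussian by least squares on the logarithm of the counts (a Gaussian is a parabola in log space), skips blank cells (represented as `None`), and then fills those cells from the fitted curve. All function names are hypothetical.

```python
import math

def _det3(m):
    # Determinant of a 3x3 matrix given as nested lists.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def _solve3(m, rhs):
    # Solve a 3x3 linear system with Cramer's rule.
    d = _det3(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        solution.append(_det3(mc) / d)
    return solution

def fit_gaussian_log(ages, counts):
    """Fit counts ~ A * exp(-(age - mu)^2 / (2 sigma^2)).

    Least squares on log(counts); cells that are None are skipped.
    Returns (A, mu, sigma).
    """
    pts = [(a, math.log(c)) for a, c in zip(ages, counts)
           if c is not None and c > 0]
    if len(pts) < 3:
        raise ValueError("need at least three non-blank cells")
    x0 = sum(a for a, _ in pts) / len(pts)  # centre ages for stability
    xs = [a - x0 for a, _ in pts]
    ys = [y for _, y in pts]
    # Normal equations for log(count) = c0 + c1*x + c2*x^2.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    c0, c1, c2 = _solve3(m, t)
    if c2 >= 0:
        raise ValueError("data are not peaked; Gaussian fit fails")
    sigma = math.sqrt(-1.0 / (2.0 * c2))
    mu = x0 - c1 / (2.0 * c2)
    amplitude = math.exp(c0 - c1 ** 2 / (4.0 * c2))
    return amplitude, mu, sigma

def fill_missing(ages, counts):
    """Replace None cells with values from the fitted Gaussian."""
    amplitude, mu, sigma = fit_gaussian_log(ages, counts)
    return [c if c is not None
            else amplitude * math.exp(-(a - mu) ** 2 / (2 * sigma ** 2))
            for a, c in zip(ages, counts)]

def non_esi_counts(new_investigator_counts, esi_counts_filled):
    """Non-ESI New Investigators = all New Investigators minus ESIs."""
    return [n - e for n, e in zip(new_investigator_counts, esi_counts_filled)]
```

Fitting in log space gives the sparsely populated tails more weight than a direct fit to the counts would, so with real, noisy data the filled-in values should be treated as rough estimates. Since the blank cells hold fewer than 10 grantees each, the effect on the non-ESI curves should be small either way.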
One final parameter of interest is the percentage of New Investigators who are ESIs. NIH had discussed trying to substantially increase this percentage over time. These percentages are shown below:
The percentage does appear to have increased to some extent over this period, although the increase is relatively modest.
I welcome your thoughts about these data and what they might suggest in terms of the success of current or potential new NIH policies.