In my previous post, I examined the fraction of NIH PIs who had either a single R01 (or R37 Merit Award) or multiple R01s for fiscal year 2014. Overall, about 30% of R01 PIs had more than one R01. In the comments and on Twitter, the question came up of whether those with multiple R01s had them from the same IC or from multiple institutes.
To address this, I asked: if a PI had an R01 from one institute, what is the distribution of ICs for the additional R01s going to the same PI? The results are tabulated below:
NIH Abbreviation Key: AA=NIAAA, AG=NIA, AI=NIAID, AR=NIAMS, AT=NCCAM, CA=NCI, DA=NIDA, DC=NIDCD, DE=NIDCR, DK=NIDDK, EB=NIBIB, ES=NIEHS, EY=NEI, GM=NIGMS, HD=NICHD, HG=NHGRI, HL=NHLBI, LM=NLM, MD=NIMHD, MH=NIMH, NR=NINR, NS=NINDS
Overall, the percentage of additional R01s coming from the same IC ranges from 47% to 75%. For those that do not come from the same IC, the number of ICs contributing substantially ranges from a few to many, as illustrated below (the figure shows the data above as the fraction of R01s from each IC, given an R01 from a particular IC).
For example, if a PI has one R01 from AA (NIAAA), 61% of additional R01s come from AA and 18% come from DA (NIDA), leaving 21% for the remaining ICs. In contrast, if a PI has a grant from GM (NIGMS) or CA (NCI), it takes 4 additional ICs to reach 18% of additional R01s.
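The tabulation and the row fractions described above could be computed along the following lines. This is only a minimal Python sketch, not the code behind the tables (the original analysis was done in R and is not shown). The toy `grants` list, the function names, and the counting convention (every ordered pair of distinct R01s held by the same PI counts once) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical toy data: one (PI identifier, IC code) tuple per R01.
grants = [
    ("pi1", "AA"), ("pi1", "AA"), ("pi1", "DA"),
    ("pi2", "GM"), ("pi2", "CA"),
    ("pi3", "GM"),  # single-R01 PI, contributes nothing
]

def additional_r01_counts(grants):
    """Count additional R01s by IC, given an R01 from each IC.

    Assumed convention: for each PI, every ordered pair of distinct
    grants (given, other) adds one to counts[given IC][other IC].
    """
    by_pi = defaultdict(list)
    for pi, ic in grants:
        by_pi[pi].append(ic)
    counts = defaultdict(Counter)
    for ics in by_pi.values():
        for given, other in permutations(ics, 2):
            counts[given][other] += 1
    return counts

def row_fractions(counts):
    """Convert each row of counts into fractions that sum to 1."""
    return {
        given: {ic: n / sum(row.values()) for ic, n in row.items()}
        for given, row in counts.items()
    }

counts = additional_r01_counts(grants)
fractions = row_fractions(counts)
for given in sorted(fractions):
    print(given, fractions[given])
```

With real data, each row of `fractions` would correspond to one row of the fraction figure: the diagonal entry is the same-IC share, and the off-diagonal entries are the shares from other ICs.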
Which ICs are linked by having PIs with multiple R01s? I examined the top two contributors of additional R01s for each IC (in addition to the IC itself). Using these "top two lists", I joined the pairs of ICs. I used a bold line if the link was bidirectional, that is, if each IC appeared on the top two list of the other. The results are depicted below:
Overall, the patterns that emerge are as might be anticipated. The bidirectional links are between AA-DA, MH-NS, DK-HL, CA-GM, and CA-AI. Some of the larger ICs are linked to many other ICs, reflecting both their size and their relatively broad missions.
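The "top two list" construction and the bidirectional test can be sketched as follows. The counts here are invented toy numbers, not values from the table, and the helper names `top_two` and `links` are hypothetical. Ties are broken alphabetically, which is an arbitrary choice.

```python
# Hypothetical toy counts: counts[given_ic][other_ic] = additional R01s.
toy_counts = {
    "AA": {"AA": 9, "DA": 5, "MH": 2, "NS": 1},
    "DA": {"DA": 9, "AA": 4, "MH": 3, "NS": 1},
    "MH": {"MH": 9, "NS": 5, "DA": 3, "AA": 1},
    "NS": {"NS": 9, "MH": 5, "HL": 2, "DA": 1},
}

def top_two(counts):
    """For each IC, list the two other ICs with the most additional R01s."""
    result = {}
    for given, row in counts.items():
        others = [(n, ic) for ic, n in row.items() if ic != given]
        # Largest count first; alphabetical order breaks ties.
        others.sort(key=lambda t: (-t[0], t[1]))
        result[given] = [ic for _, ic in others[:2]]
    return result

def links(top):
    """Map each linked pair of ICs to True if the link is bidirectional."""
    edges = {}
    for a, partners in top.items():
        for b in partners:
            pair = tuple(sorted((a, b)))
            # b is on a's list by construction; check the reverse direction.
            edges[pair] = a in top.get(b, [])
    return edges

top = top_two(toy_counts)
edges = links(top)
for pair, bidirectional in sorted(edges.items()):
    print(pair, "bold" if bidirectional else "thin")
```

In this toy example the AA-DA, DA-MH, and MH-NS links come out bidirectional, while AA-MH and HL-NS are one-way.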
As noted in the comments, some of these connections could be attributed to the size of the ICs. Thus, NCI appeared to be linked to many other ICs, but this could be due to the large number of R01s awarded by NCI rather than by actual content overlap.
To address this, I simulated results assuming that the probability of an additional grant coming from a particular IC was proportional to the number of grants that this IC awarded in this data set. I then compared the simulated results with the actual results. Of course, the number of grants going to the same IC was much higher than would be expected. Since this distorted the other statistics, I set all of these values equal to 0 and re-simulated the data. I (or, more correctly, R) performed 1000 simulations and then calculated the mean, standard deviation, and other statistics for these distributions of grant numbers. I then compared these with the actual values observed in the data. The results (presented as the log (base 10) of the probability of occurring by chance) are presented below:
These results allow assessment of the strength of the interactions corrected for IC size.
The strongest interaction is between NIDA and NIAAA, with a probability of occurring by chance of < 10^-88.
The other strong interactions are:
NIMH and NINDS
NIAMS and NIDCR (which was still detected previously even though these are both relatively small ICs)
NIDA and NIMH
NIDDK and NHLBI
NIGMS and NIAID
NIDCD and NEI (which was not detected previously)
The link between NCI and NIGMS is still the strongest link between NCI and another IC, but it is substantially less pronounced than the other links above.
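For readers who want to try the size correction themselves, here is a rough Python sketch of the simulation described above (the original was done in R and is not shown). The IC sizes, the observed count, and the function names are hypothetical. The post does not say how the probabilities were obtained from the simulated distributions; this sketch assumes a normal upper-tail approximation based on the simulated mean and standard deviation, since probabilities as small as those reported cannot be estimated directly from 1000 draws.

```python
import math
import random
import statistics

def simulate_counts(given, n_additional, ic_sizes, n_sims=1000, seed=0):
    """Simulate additional-R01 counts per IC under the size-only null model.

    Each additional grant is drawn with probability proportional to IC
    size; the IC `given` is excluded (its probability is set to 0).
    """
    rng = random.Random(seed)
    ics = [ic for ic in ic_sizes if ic != given]
    weights = [ic_sizes[ic] for ic in ics]
    sims = {ic: [] for ic in ics}
    for _ in range(n_sims):
        draws = rng.choices(ics, weights=weights, k=n_additional)
        for ic in ics:
            sims[ic].append(draws.count(ic))
    return sims

def log10_upper_tail(observed, samples):
    """log10 of P(count >= observed), assuming a normal approximation."""
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)
    if sd == 0:
        return float("nan")
    z = (observed - mean) / sd
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return math.log10(p) if p > 0 else float("-inf")

# Hypothetical numbers of R01s per IC and a hypothetical observation.
ic_sizes = {"AA": 100, "DA": 200, "CA": 1000, "GM": 700}
sims = simulate_counts("AA", n_additional=50, ic_sizes=ic_sizes)
observed_da = 30  # hypothetical observed AA -> DA count
print(log10_upper_tail(observed_da, sims["DA"]))
```

Under this null model the expected AA to DA count in the toy example is about 5 of 50, so an observed count of 30 gives a very negative log probability, which is the kind of signal the table of log probabilities is meant to surface.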
Thanks for the comments. I think this is a much improved analysis, and I had an excuse to explore additional R tools.
I am now working on generating a 2-dimensional figure that reflects these connectivities in a more formal way.