K99-R00 Publication Analysis-Part 7-High Profile Publications

(by datahound) Oct 28 2014

In my first post in this series, I noted that approximately 20% of the 2007 cohort of K99 awardees had a "very high profile" (Science, Nature, Cell, New England Journal of Medicine, or JAMA) publication and 40% had some other "high profile" (PNAS, other Cell or Nature journal, Journal of Clinical Investigation) publication during or prior to 2007. Following comments, I noted in a later post that the distribution was not uniform across investigators supported by the various NIH institutes and centers.

Here, I examine such "high profile" publications both before and after the K99 awards. The distribution of the number of "high profile" publications across a pool of 132 K99-R00 investigators who received their K99 awards in FY2007 is shown below:

Glamor pub histogram-2

The average number of Science, Nature, Cell, NEJM, or JAMA papers is 0.8 per investigator with a range of 0 to 9. More than 60% of these investigators have not published in Science, Nature, Cell, NEJM, or JAMA, either before or after receiving the K99 award. The average number of other "high profile" publications is 2.6 per investigator with a range of 0 to 26.

To depict the relationship between the number of high profile publications before and after receiving the K99 award, the number of all "high profile" publications after 2008 is plotted versus the number of all "high profile" publications in 2008 or before:

Correlation glamor

The overall correlation coefficient is 0.59. Note that several investigators who had not published any "high profile" papers prior to receiving the K99 award have published such papers subsequently, many of them during or after the R00 phase.

Updated: In response to comments on Twitter, I looked at last author publications. Of the 47 Science, Nature, Cell, NEJM, or JAMA papers published after 2008, 17 had the K99 awardee as last author. Of the 174 other "high profile" publications published after 2008, 70 had the K99 awardee as last author.

Below is a plot of the number of "high profile" last author publications versus the number of "high profile" first author publications. The numbers on the graph correspond to the numbers of investigators represented by each data point.

Glamor-First-Last plot

The correlation coefficient is 0.47.

No responses yet

K99-R00 Publication Analysis-Part 6-First and Last Authorships

(by datahound) Oct 27 2014

The K99-R00 program is an interesting one from the point of view of evaluation since this program has cohorts of investigators who begin at approximately the same career stage and then progress through several transitions into independent careers. These different career stages should be reflected in different positions in the author list for publications. Below is a plot of the total number of publications, the number of first author publications, the number of last author publications, and the number of sole author publications per investigator per year for 132 investigators who received K99 awards in FY2007 and went on to receive an R00 award. No attempt was made to correct for joint first authors or for joint corresponding authors.
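The tallies behind this plot can be sketched with a small helper. This is a minimal sketch assuming hypothetical publication records as (year, author list) pairs; as in the analysis above, joint first or corresponding authorships are not corrected for.

```python
from collections import defaultdict

def authorship_counts(pubs, investigator):
    """Count total, first author, and last author publications per year.

    `pubs` is a list of (year, author_list) pairs; `investigator` is the
    awardee's name exactly as it appears in the author lists (hypothetical
    format; real PubMed records would need name normalization).
    """
    counts = defaultdict(lambda: {"total": 0, "first": 0, "last": 0})
    for year, authors in pubs:
        if investigator not in authors:
            continue
        counts[year]["total"] += 1
        if authors[0] == investigator:
            counts[year]["first"] += 1
        if len(authors) > 1 and authors[-1] == investigator:
            counts[year]["last"] += 1
    return dict(counts)

# Toy records: a first author paper in 2006, a last author paper in 2010
records = [(2006, ["Smith J", "Jones A"]), (2010, ["Lee K", "Smith J"])]
print(authorship_counts(records, "Smith J"))
```

Summing these per-year counts across a cohort and dividing by the number of investigators gives the per-investigator-per-year curves plotted here.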

All Pubs-First,Last-plot-Large

As anticipated, this plot reveals that, prior to receiving the K99 award, these investigators (on average) publish an approximately equal mixture of first author and middle author publications, with very few last author publications. A year after receipt of the K99 award, the number of first author publications per year begins to drop while the number of last author publications grows. This increase in last author publications is associated with an increase in the average number of total publications per year. Note that the dotted lines leading to the 2014 results reflect the fact that the data do not include all publications for calendar year 2014.

These data reflect the full cohort of 2007 K99 awardees for which reliable publication data were compiled without regard to the year that each investigator transitioned to the R00 phase. The data separated according to R00 groups, with 50 investigators who received R00 awards in FY2007 or FY2008, 63 investigators who received R00 awards in FY2009, and 19 investigators who received R00 awards in FY2010 or later, are shown in the plot below:

Pubs-R00 year-First,Last-3

Again, as anticipated, the transition from first author publications to last author publications occurs earliest for the FY2007-2008 R00 cohort, later for the FY2009 R00 cohort, and latest for the FY2010-or-later R00 cohort.

In addition, the first plot includes both investigators who have been successful in obtaining R01s or similar awards from NIH as well as those who have not. The results for these two groups are shown below:

Pubs-R01-NoR01 plot

The number of first author publications appears to be slightly higher for those who have received R01s than for those who have not. The number of last author and other publications is higher for those who have received R01s after receiving the K99 awards, but this is likely largely a consequence of the science performed with the support of the R01 funding.

11 responses so far

K99-R00 Publication Analysis-Part 4-Revisited

(by datahound) Oct 24 2014

Spurred, in part, by Drugmonkey's post, I have been thinking further about the analysis I did relating the number of publications to the number of authors per publication. I realize that I did not fully grasp the implications of my results. A key question is whether an increased number of publications per investigator can be accounted for by an increased number of authors per publication.

Two limiting cases can be considered. In the first case, the average number of authors per publication would be essentially constant, regardless of the number of publications by a given investigator. This case is ruled out by the data presented, which show that the average number of authors is positively correlated with the number of publications, with a correlation coefficient of 0.47.

In the second case, the average number of authors per publication increases directly with the number of publications. In this case, the plot of the number of publications weighted by 1/the number of authors versus the number of publications (the plot highlighted by Drugmonkey) would be a line with slope 0. However, the trend line in the plot has a substantial positive slope. Previously, I focused on the fact that the correlation coefficient for this plot is large (0.83). However, this is not really the point. It is not surprising that the weighted number of publications is relatively well correlated with the number of publications. It is the slope of the line that conveys the information.

Simulations suggest that the slope of the line in this plot is close to what one would expect if the number of authors per publication were constant. Another way to see the same point is to consider the average number of authors for investigators with the smallest and largest number of publications.
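The two limiting cases can be reproduced in miniature: under constant authorship, the weighted-versus-raw plot has slope 1/(authors per paper); under proportional authorship, it is flat. A minimal sketch with hypothetical publication counts:

```python
def slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

pubs = list(range(5, 60))                # hypothetical publication counts

# Case 1: a constant 5 authors per paper, so each paper is worth 1/5
w_const = [p / 5.0 for p in pubs]
# Case 2: authors proportional to output, so the weighted sum is flat
w_prop = [p / (0.5 * p) for p in pubs]   # always 2.0

print(slope(pubs, w_const))  # 0.2, i.e. 1/5
print(slope(pubs, w_prop))   # 0.0
```

A substantially positive slope in the real data is thus the signature of case 1 (roughly constant authorship), not case 2.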

20 investigators with the fewest publications: average number of publications 9.1; average number of authors per publication 4.6

20 investigators with the most publications: average number of publications 57.4; average number of authors per publication 7.3

Thus, while the number of publications increases by a factor of 57.4/9.1 = 6.3, the average number of authors per publication increases by only a factor of 7.3/4.6 = 1.6.

These data do not support the notion that the increased number of publications is due primarily to an increased number of authors per publication.

As I posted subsequently, a major factor contributing to the number of publications is the amount of support that each investigator has been able to garner.

5 responses so far

K99-R00 Publication Analysis-Part 5-NIH Funding

(by datahound) Oct 24 2014

In analyzing the publication patterns of K99-R00 awardees, one crucial factor is the amount of funding that each investigator has obtained to support his/her research. Each K99-R00 investigator likely has received a comparable level of support through the K99-R00 mechanism itself. However, different investigators have been more or less successful in garnering other NIH support such as R01s or other grants. This information can be relatively easily gathered through NIH RePORTER. This is only part of the funding story, as each investigator likely received start-up funds as part of their transition from postdoc to independent faculty member. Furthermore, some investigators likely have been able to obtain other sources of funding such as NSF and private foundations. With those disclaimers, below is a plot of the number of publications versus the total amount of NIH support obtained beginning with the K99 award in 2007 and extending to the present (not including funding from grants with co-PIs, which affects a few investigators).

Pubs-Funding plot


Sharp-eyed readers may note the absence of the investigator with the most (95) publications from my initial post on this subject. This investigator received funding only through the R00 phase before moving to a position in Europe, so I removed this data point from the plot. The correlation coefficient for these parameters is 0.30. The correlation improves slightly to 0.32 if each publication is weighted by 1/number of authors as described in the previous post.
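The funding totals used here can be assembled from a RePORTER download with a simple aggregation. This is a minimal sketch; the rows and field layout are hypothetical, since real exports use different column names.

```python
from collections import defaultdict

# Hypothetical (PI name, fiscal year, total cost) rows from a RePORTER export
awards = [
    ("SMITH, JANE", 2007, 249000),    # K99 phase
    ("SMITH, JANE", 2009, 498000),    # R00 years
    ("SMITH, JANE", 2012, 1250000),   # a subsequent R01
    ("DOE, JOHN", 2008, 249000),
]

totals = defaultdict(int)
for pi, year, cost in awards:
    if year >= 2007:                  # support beginning with the K99 award
        totals[pi] += cost

print(dict(totals))
```

Each investigator's total then becomes the x-coordinate in the publications-versus-funding plot.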

This plot includes all publications for each investigator. However, publications prior to receiving the K99 award were not supported by these funds. Below is a plot of publications from 2008 to the present versus total NIH support.

Pubs-2008 vs Funding Plot

Restricting the analysis to these publications increases the correlation coefficient slightly, to 0.33.

These data support the not-at-all-surprising conclusion that the total number of publications is roughly associated with the amount of financial support for the research. The lack of better correlation likely involves a combination of factors including the absence of information about start-up funds and non-NIH support, the different costs of research in different fields, and differences in publication styles between investigators.

One response so far

K99-R00 Publication Analysis-Part 4-Numbers of Authors

(by datahound) Oct 23 2014

In my first K99-R00 publication analysis post, I presented the distribution of the number of publications for the FY2007 K99-R00 cohort. In this post, I examine the distributions of the numbers of authors per paper.

For 134 K99 investigators for whom I have been able to identify publications relatively unambiguously through PubMed and who transitioned to an R00 award, a histogram of the number of authors per publication is shown below:

Publication histogram


The median number of authors is 5 and the mean is 6.5. Of course, the average number of authors per publication varies from investigator to investigator, reflecting different circumstances, areas of science, and other factors. The distribution is shown below:

Average Authors per Paper Histogram

How does the average number of authors per publication relate to the number of publications?

No Authors vs No Publications


As this plot shows, these two parameters are correlated with a correlation coefficient of 0.47. Investigators with a larger number of publications tend to have more authors per publication.

One way to correct for the influence of an increased number of authors on the number of publications is to weight each publication by 1/(number of authors) (as was suggested by a comment on Twitter). In this scenario, a paper with two authors would be worth 1/2 while a paper with 10 authors would be worth 1/10.
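This weighting can be sketched in a few lines, using hypothetical author counts per paper; `pearson` is a plain implementation of the correlation coefficient used throughout these posts.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# One list of author counts per investigator, one entry per paper (toy data)
investigators = [
    [3, 4, 5],
    [5, 6, 6, 8, 10],
    [4, 7, 9, 9, 12, 15],
]

n_pubs = [len(papers) for papers in investigators]
weighted = [sum(1.0 / a for a in papers) for papers in investigators]
print(n_pubs, [round(w, 2) for w in weighted])
print(pearson(n_pubs, weighted))
```

The same `pearson` call applied to the real cohort's raw and weighted counts gives the correlation coefficients quoted in these posts.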

This weighted sum of publications is plotted versus the total number of publications below:

1 over n versus No Pubs

This tightens the distribution substantially, and the correlation coefficient increases to 0.83. One interpretation is that differences in authorship practices do influence the number of publications for a substantial part of the distribution. The investigators on the tail of the distribution with a large number of publications still lie on the edge of the distribution with this correction. In general, these investigators have been successful in obtaining substantial resources subsequent to their R00 awards, leading to an increased number of publications.

7 responses so far

K99-R00 Publication Analysis-Part 3-IC Investments

(by datahound) Oct 22 2014

After my previous post, Drugmonkey commented on the number of K99 awards expected from each IC based on its share of the overall NIH budget. I examined the number of new K99 awards for each IC for fiscal years 2007-2013. There has been only relatively modest variation from year to year for each IC. Moreover, the number of K99 awards for each IC is approximately proportional to its share of the overall NIH budget, with a few notable exceptions. This is shown below:

K99 Obs vs Expected


This plot shows the expected number of awards, based on a total of 190 awards across NIH, versus the average number of awards per year for each IC. The number of awards for most ICs is relatively close to that expected. The biggest outlier by far is NIAID, where 30 awards per year are expected but the actual average number of awards is approximately 6. For NCI, the number of awards per year expected is 33 and the average is 27. In the other direction, the number of awards expected is 20 but the average number of awards is 29.
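The expected values are simply each IC's budget share multiplied by the roughly 190 K99 awards made per year. A minimal sketch; the shares below are back-calculated from the expected counts quoted above, not taken from actual budget tables.

```python
total_awards = 190

# Budget shares implied by the expected counts in the text (33 and 30 of 190)
budget_share = {"NCI": 33 / 190, "NIAID": 30 / 190}
observed = {"NCI": 27, "NIAID": 6}

for ic, share in budget_share.items():
    expected = total_awards * share
    print(f"{ic}: expected {expected:.0f} awards/year, observed {observed[ic]}")
```

With actual budget shares in hand, the same loop would generate the expected column for every IC in the plot.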

16 responses so far

K99-R00 Publication Analysis-Part 2-IC Distributions

(by datahound) Oct 21 2014

In response to comments on my recent post, I have examined the IC distribution of both numbers of publications and "high profile" publications prior to receiving the K99 award as a function of the funding institute or center. Recall that my analysis included only those K99 awardees who went on to receive an R00 award and for whom the investigator's name was unambiguous enough to allow relatively reliable retrieval of publications from PubMed. For several investigators, name changes had occurred over the period examined and these were accounted for where possible through web searches. The analysis reflects 135 K99 awardees out of the total of 182 K99 awardees for FY2007.

The numbers of publications by each investigator (black dots) organized by the funding IC along with the median for each IC (red bars) are shown below:

Pubs by IC

No dramatic trends are observed although, given that the number of K99 awardees per IC ranges from 1 to 20 with typical numbers less than 10, the sample sizes are too small to support any robust conclusions.

The numbers of K99 awardees with very high profile (Cell, Nature, Science, or NEJM) or high profile publications for each IC are tabulated below:

IC      Total   C, N, S, NEJM   Other high profile pub
NCI      13          4                   9
NIAID     7          3                   3
NHLBI    20          2                   8
NIGMS    11          3                   4
NIDDK     9          2                   3
NINDS     9          3                   4
NIMH      7          2                   1
NICHD     5          0                   1
NCRR      4          2                   0
NIA       6          3                   2
NIDA      4          1                   1
NIEHS     5          0                   0
NEI       4          0                   0
NIAMS     5          1                   3
NHGRI     3          1                   2
NIAAA     3          1                   0
NIDCD     6          1                   2
NIDCR     4          0                   0
NLM       2          0                   0
NIBIB     1          0                   1
NINR      4          0                   1
NCCAM     1          0                   0
FIC       2          0                   1

Here, more striking trends are apparent, with all 13 analyzed awardees from NCI having a high profile publication in or prior to 2007. Similar results are observed for other large ICs (e.g. NIAID, 6/7; NHLBI, 10/20; NIGMS, 7/11). This supports the notion that a record of one or more high profile publications was very important for receiving a K99 award from some ICs or in some fields.

8 responses so far

K99-R00 Publication Analysis-Part 1

(by datahound) Oct 19 2014

The NIH K99-R00 program is an important program related to the transition from postdoc to faculty positions. This program also presents an unusual opportunity for evaluation since cohorts of scientists at similar career stages compete for initial K99 awards and then can transition to R00 awards and then to R01s and other awards. I have previously posted analyses including the transitions to R00 and R01 grants, gender disparity in R01 transition probabilities, differences between NIH institutes and centers, and gender differences between R00 institutions.

I am now starting to analyze the publication patterns of K99-R00 awardees. For this study, I examined the initial 2007 K99 cohort of 182 investigators, of whom 170 transitioned to R00 awards. I examined the publications of these investigators through the Advanced Search function of PubMed. In many cases, this appeared to produce a relatively comprehensive list of publications based on comparisons with websites and other sources. In other cases, the results appeared problematic due to issues of name ambiguity or a significant number of publications that do not appear in PubMed. Publication lists through the present were generated for 135 investigators.

The total number of publications for each investigator is shown below:

Total Pub Distributions

The number of publications ranges from less than 10 to nearly 100. In some cases, for investigators with a relatively small number of publications, technical issues may have resulted in undercounting; in a few other cases, the investigators appear to have left academia sometime after receiving the R00 award. Of course, publication numbers have considerable limitations, and no attempt has been made at this point to examine individual publications in terms of citations or other measures.

These publications can be broken down roughly into those leading up to the K99 award and those that occurred after receiving this award. While doing this relatively precisely would require going through individual publications, I used the number of publications in 2007 or before as a surrogate:


The publications after 2007 (2008-2014) are shown below:

Post 2007 pubs

These correspond to publications produced during the K99 award, during the R00 award, subsequent publications, as well as some publications of results generated prior to the K99 award that were somewhat slow to be published.

The correlation between the number of publications in 2007 and before and the number of post-2007 publications is shown below:

Pre-Post Correlation

Not surprisingly, these are relatively strongly correlated with a correlation coefficient of approximately 0.6. Of course, this reflects differences in the publication patterns between fields and other factors in addition to some more calibrated measure of investigator productivity.

One additional factor that I have examined involves the meme that a publication in Science, Nature, or Cell is highly correlated with receiving a K99 award. Examination of the publication lists reveals that approximately 20% of the K99 awardees have a publication in Science, Nature, Cell or New England Journal of Medicine prior to or in 2007. In addition, approximately 40% have a publication in other relatively high profile journals such as PNAS, other Nature or Cell journals, and the Journal of Clinical Investigation.
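The classification used above can be expressed as a small lookup. The tier definitions follow this post; matching on exact PubMed-style journal abbreviations is my assumption, and real data would need normalization of journal names.

```python
# "Very high profile" journals as defined in this series of posts
VERY_HIGH = {"Science", "Nature", "Cell", "N Engl J Med", "JAMA"}
# Other "high profile" venues: PNAS, other Nature/Cell journals, JCI
# (prefix matching on PubMed-style abbreviations is an assumption)
HIGH_PREFIXES = ("Nat ", "Cell ", "Proc Natl Acad Sci", "J Clin Invest")

def tier(journal):
    if journal in VERY_HIGH:
        return "very high profile"
    if journal.startswith(HIGH_PREFIXES):
        return "high profile"
    return "other"

print(tier("Nature"), tier("Nat Genet"), tier("J Bacteriol"))
```

Applying `tier` to each pre-2008 publication and asking whether any paper falls in each tier yields the roughly 20% and 40% figures quoted here.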

With this list of nearly 4000 publications along with the other data that I have assembled on this cohort of investigators, much more analysis is possible and I welcome thoughts about what might be interesting.

23 responses so far

Perspectives on the NPR NIH Stories

(by datahound) Sep 26 2014

Recently, NPR (through the work of Richard Harris and colleagues) aired a series of 7 stories about biomedical research and NIH funding, with 5 stories on Morning Edition (Tuesday, Wednesday, Monday, Tuesday2, Wednesday2) and 2 on All Things Considered (Tuesday, Tuesday2).

The first set of stories on Tuesday, September 9th, focused on the "boom and bust" funding environment, beginning with the budget "doubling" followed by the past decade with its associated loss of buying power, and on profiles of a couple of scientists who had moved on to non-scientific careers. These were followed by stories about over-building of research space, non-reproducibility of animal studies ascribed to hyper-competitiveness, the mismatch between the number of trainees and the number of academic jobs, and alternative models for setting research agendas with the National Breast Cancer Coalition as an example; the series concluded with a discussion with former NIH Director and current NCI Director Harold Varmus about some potential adjustments to the system.

There has been an active set of discussions about these stories and related topics over at Drugmonkey (here, here, here, and here).

I learned that at least one story about the NIH was in the works when Richard Harris emailed me to initiate a discussion about these issues back in April. This was just prior to the panel discussion at the Experimental Biology meeting that I had been planning with the ASBMB Public Affairs Advisory Committee on related topics. I sent Richard our white paper on Building a More Sustainable Biomedical Enterprise as well as my recent ASBMB Today column about the impact of the sequester on the number of R funded investigators. Over the course of our discussions, I helped Richard and his colleagues with the use of NIH RePORTER, both to confirm statistics and, more importantly, to compile a list of investigators who had recently lost funding, to identify potential subjects for stories about the impact of the sequester and the disequilibrium of the biomedical research enterprise.

Two points. First, this highlights a key challenge of journalism. Stories that focus on statistics (e.g. 1000 investigators lost R funding due to the sequester) tend to be rather sterile and not compelling in the public (as opposed to the scientific) sphere. Thus, he was seeking specific people to approach, to find some who would go on the record about their experiences and the impact of the funding situation on their careers. Of course, each specific example has its own idiosyncrasies and it is very difficult to find a few "typical" cases that approximately capture the full reality of what is going on. For example, the scientists who had left academic positions to start a business producing liquor or to run a grocery struck some (including me) as odd examples, given that we are more familiar with those leaving academia (and research) to move into communications or other "more traditional" science career alternatives.

In any event, I feel it is important to recognize the journalistic challenge of finding real human examples to make a story three-dimensional and compelling to the public. We should be appreciative of reporters who make the effort and of individuals who are willing to share their own stories so publicly.

Second, I was struck by the differences between reporting and advocacy. The story about how animal model studies relevant to ALS research turned out to be not very robust does not paint a flattering picture of some aspects of the biomedical research enterprise. In a short piece, it is difficult to explore all of the factors that contributed (or might have contributed) to such outcomes, so the piece might come across as unfair. Nonetheless, in my opinion, it is very important to understand how the public perceives these issues (again, as discussed at Drugmonkey here and here) and having them aired in public, while uncomfortable, certainly has an upside.

My bottom line is that the scientific community needs to capitalize on the public awareness that comes from such press coverage. We need to learn from the stories and the public reactions to them, work to address the issues that we can tackle, and focus energy into productive channels for improving the scientific enterprise and the public understanding of it, to the best of our ability.

8 responses so far

Federal RePORTER-A New Tool of Science Data Wonks

(by datahound) Sep 26 2014

Recently, Drugmonkey put up a post with the understated title Federal RePORTER!!!!!!!!!!!!! He pointed to a new project from the Star Metrics program: a version of the NIH RePORTER website that now includes data from NSF, USDA, and EPA in addition to NIH. Needless to say, I could not resist having a look.

One question that occurred to me right away is how many NIH funded investigators also have NSF funding. A quick download from Federal RePORTER and I had an answer (given my previous work on NIH data).

For FY2013:

25361 investigators had R-mechanism funding from NIH

11440 awards were listed on Federal RePORTER from NSF corresponding to 10260 unique investigators (with some uncertainty due to potential name overlap)

196 individuals were on both lists.
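The overlap count comes from a set intersection on normalized names. A minimal sketch; the name lists are hypothetical, and the crude "LAST, first initial" key illustrates exactly the name-overlap uncertainty noted above.

```python
# Hypothetical PI lists from the NIH and NSF downloads
nih_pis = ["Smith, Jane A.", "Doe, John", "Lee, Kyung"]
nsf_pis = ["SMITH, JANE", "Park, Min", "LEE, KYUNG"]

def key(name):
    """Normalize to (LAST, first initial); crude, so homonyms collide."""
    last, _, first = name.partition(",")
    first = first.strip()
    return (last.strip().upper(), first[:1].upper())

both = {key(n) for n in nih_pis} & {key(n) for n in nsf_pis}
print(len(both))  # 2 of the 3 NIH PIs also appear on the NSF list
```

A coarser key undercounts distinct people (homonyms collide), while a stricter key misses true matches when the two agencies format names differently; either bias affects the 196 figure.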

This strikes me as a surprisingly low number, corresponding to a few investigators per institution. However, I grew up in chemistry departments, likely a field where funding from both NIH and NSF is most common.

Suggestions about other questions are welcome although the data available from Federal RePORTER is still limited (e.g. only back to 2004).

7 responses so far
