In my first K99-R00 publication analysis post, I presented the distribution of the number of publications for the FY2007 K99-R00 cohort. In this post, I examine the distributions of the numbers of authors per paper.

For the 134 K99 investigators who transitioned to an R00 award and whose publications I have been able to identify relatively unambiguously through PubMed, a histogram of the number of authors per publication is shown below:

The median number of authors is 5 and the mean is 6.5. Of course, the average number of authors per publication varies from investigator to investigator, reflecting different circumstances, areas of science, and other factors. The distribution of these per-investigator averages is shown below:

How does the average number of authors per publication relate to the number of publications?

As this plot shows, these two parameters are correlated with a correlation coefficient of 0.47. Investigators with a larger number of publications tend to have more authors per publication.
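For readers who want to reproduce this kind of number, here is a minimal sketch of a Pearson correlation coefficient in Python. I am assuming the coefficient quoted above is the usual Pearson r. The function name `pearson_r` and the example values are hypothetical placeholders, not the actual K99-R00 data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (one entry per investigator)
publication_counts = [4, 9, 15, 22, 40]
mean_authors_per_paper = [3.5, 5.0, 4.8, 7.2, 8.1]

r = pearson_r(publication_counts, mean_authors_per_paper)
print(f"correlation coefficient: {r:.2f}")
```

With the real data, each list would hold one value per investigator, in the same order.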

One way to correct for the influence of an increased number of authors on the number of publications is to weight each publication by 1/(number of authors) (as was suggested by a comment on Twitter). In this scenario, a paper with two authors would be worth 1/2 while a paper with 10 authors would be worth 1/10.
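As a rough illustration of the 1/(number of authors) weighting, the sketch below computes the weighted sum for a single investigator. The function name `weighted_publication_count` and the author counts are made up for the example.

```python
def weighted_publication_count(author_counts):
    """Sum 1/(number of authors) over one investigator's publications."""
    if any(n < 1 for n in author_counts):
        raise ValueError("every publication needs at least one author")
    return sum(1.0 / n for n in author_counts)


# Hypothetical investigator with three papers: 2, 10, and 5 authors
example_author_counts = [2, 10, 5]

total_publications = len(example_author_counts)
weighted_publications = weighted_publication_count(example_author_counts)

# The 2-author paper contributes 1/2, the 10-author paper 1/10, the 5-author paper 1/5
print(total_publications, round(weighted_publications, 3))
```

An investigator whose papers all have a single author would have a weighted sum equal to the raw publication count. Longer author lists pull the weighted sum below it.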

This weighted sum of publications is plotted versus the total number of publications below:

This tightens the distribution substantially and the correlation coefficient increases to 0.83. One interpretation of this is that differences in authorship practices do influence the number of publications for a substantial part of the distribution. The investigators in the tail of the distribution with a large number of publications still lie on the edge of the distribution with this correction. In general, these investigators have been successful in obtaining substantial resources subsequent to their R00 awards, leading to an increased number of publications.

[…] final figure from Datahound's post on K99/R00 recipients who have managed to win R01 funding is fascinating to me. This is a plot of individual […]

What am I missing here? Your weighted analysis seems to argue something different. i.e., that even when you try to account for "authorship practices" the relationship still holds.

This assumes what you mean by "authorship practices" is a relative difference in the lab/subfield/etc expectation about the degree of contribution to a given paper that deserves an author credit. In the extreme, true "guest" or "courtesy" authorships where people are being put on without really contributing. Your weighted sum should dilute the impact of non-contributing authors. In fact what you are showing is that even when you do this, more authors = more pubs. Which suggests that on the whole authors are adding incremental scientific advance on an essentially linear basis.

What I am not correcting for (yet) is total support over this time period. Many of the investigators who have published a lot over this period have been successful in getting R01s and other grants. I am working on collating the relevant data.

and for clarification, this is all-publications for the FY2007 K99ers yes? so both before and after the K99?

Yep.

It would make sense that if an investigator were to have obtained substantial resources, then they would be able to hire more workers and the #authors/paper would increase as well as the rate of publication. This is assuming that many people contribute to each pub and many people are individually first-authoring a pub at any given point in time. It may also suggest that an investigator contributes to other labs' work substantially, which would have the same effect. I am not at all surprised by this. It seems that for tenure/hiring, one should continuously be working on first/last author pubs, while contributing a skill or resource to collaborators (or labmates, depending on your career level) as much as possible.

[…] in part, by Drugmonkey's post, I have been thinking further about the analysis I did relating the number of publications to the number of authors per publication. I realize that I did […]

[…] Sharp-eyed readers may note the absence of the investigator with the most (95) publications from my initial post on this subject. This investigator only received funding through the R00 phase before moving to a position in Europe so I removed this data point from the plot. The correlation coefficient for these parameters is 0.30. The correlation improves slightly to 0.32 if each publication is weighted by 1/number of authors as described in the previous post. […]