Collating CRediT data across publishers to see the potential for improving recognition



A Twitter discussion on this topic has been initiated by eLife. PLOS, Cell, GigaScience and Aries were tagged for response. Please share responses here.


If external data-sharing tools would also help, we can link out from here.


From the Aries/Editorial Manager perspective, it is up to publishers using the CRediT integration to decide whether or not they would like to share their data. EM does support exporting this data. As a reminder, in addition to selectable CRediT roles, journals using EM can opt to collect a degree of contribution for each role and to allow the entry of “other” roles not currently within the taxonomy.


Can you please share the link to the Twitter discussion? I’ve not been able to navigate my way there. Thank you.


Hi Holly,
This link should get you there.


My colleague Helen Atkins (Director, Publishing Operations) responded on behalf of PLOS with this tweet: “We (@PLOS) have CRediT data; it’s available, fully XML tagged, and ready for analysis! Let’s talk…”


Hi Helen,

You mentioned these data in the Oct Project Credit conf call. Would it be possible for me to get a copy of these data?

As you recall I am working on a book chapter that will discuss the Contributor Roles Taxonomy and I think these data would be useful.

I will of course respect any restrictions you want to place on use of the data.

I am not sure how much actual analysis I can do, but I expect I can do some descriptive stats that would be useful.

I am happy to cite other uses of the data in addition to or in lieu of getting the data – but it might be helpful.

Thanks for considering,


Hi Cory,

CRediT data as provided by authors is included in the JATS XML of PLOS content. You can download the entire PLOS corpus of XML (over 200,000 research articles) from this page: (note: it’s about 5GB, so make sure you have time and space for the download). You graciously volunteered to respect any restrictions…and there aren’t any beyond the CC BY licensing. You can read more about the PLOS commitment to text and data mining here:

Within the individual article XML file, the CRediT data is tagged as shown below. Here’s an excerpt for the first author from the downloadable XML available for this article:

Davies Gerald Investigation Methodology Writing – review & editing 1

I hope that helps!


Well darn - The Forum editing tool removed the XML mark-up elements. You can email me at mjohnson at plos dot org and I will email you the XML markup showing the CRediT elements in the XML.

I should also note that not all of the 200K PLOS research articles will have the CRediT data; only those published recently (2016 or later) will have the CRediT author tagging.
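For anyone planning descriptive stats on the corpus, here is a minimal Python sketch of how the untagged older articles might be skipped while tallying role labels. It assumes the unzipped corpus is a flat folder of `.xml` files and that roles sit in `role` elements directly under each author's `contrib` element; the function name `tally_roles` and the folder layout are hypothetical, not anything PLOS provides.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def tally_roles(corpus_dir):
    """Count author role labels across every .xml file in corpus_dir.

    Returns (role_counts, articles_with_roles, articles_total).
    Hypothetical helper: assumes one article per file.
    """
    role_counts = Counter()
    articles_with_roles = 0
    articles_total = 0
    for path in sorted(Path(corpus_dir).glob("*.xml")):
        articles_total += 1
        root = ET.parse(path).getroot()
        roles = []
        for contrib in root.iter("contrib"):
            # Only look at authors, so editor roles are not counted.
            if contrib.get("contrib-type") != "author":
                continue
            for role in contrib.findall("role"):
                label = " ".join("".join(role.itertext()).split())
                if label:
                    roles.append(label)
        if roles:
            # Articles without any role tagging are simply not counted here.
            articles_with_roles += 1
            role_counts.update(roles)
    return role_counts, articles_with_roles, articles_total
```

Calling `tally_roles("path/to/corpus")` would give the label counts plus how many articles carried any role tagging at all, which is a quick way to see how much of the download is usable.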

Hope that helps!


Mark - the tool can display markup. Get in touch with me or @AriesAlison and we can help.


Hi David and Mark:

Mark: thanks so much for the info on CRediT usage data.

David: I think Mark was sending this to me for possible use in the chapter I am writing. Let me get thru a few deadlines & I will get in touch with you and Alison.
My first take: analyzing these data is likely more than I can manage before my 1/31/18 deadline. But I would definitely be interested in seeing what I can do with it after that. Let me take a look over the holidays and see what I can do.

Thanks to everyone,


Hi @AriesAlison & @David_Baker - thanks for the reassurances that displaying markup is possible. I’m trying again in hopes this helps @cory and anyone else looking to mine JATS for CRediT labels in the markup.

Here’s an excerpt for the first author from the downloadable XML available for this article

<contrib contrib-type="author" xlink:type="simple">
  <name name-style="western">
    <surname>Davies</surname>
    <given-names>Gerald</given-names>
  </name>
  <role content-type="">Investigation</role>
  <role content-type="">Methodology</role>
  <role content-type="">Writing – review &amp; editing</role>
  <xref ref-type="aff" rid="aff001"><sup>1</sup></xref>
</contrib>
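For anyone who wants to try this, a minimal Python sketch of reading the role labels out of an excerpt like the one above. The sample string is modelled on that excerpt, with the `xlink` namespace declared on a wrapper element so the fragment parses on its own; the function name `author_roles` is hypothetical, and this is only one way to approach it.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment modelled on the excerpt above. The xlink namespace
# is declared on a wrapper so the standalone fragment is parseable.
SAMPLE = """<contrib-group xmlns:xlink="http://www.w3.org/1999/xlink">
  <contrib contrib-type="author" xlink:type="simple">
    <name name-style="western">
      <surname>Davies</surname>
      <given-names>Gerald</given-names>
    </name>
    <role content-type="">Investigation</role>
    <role content-type="">Methodology</role>
    <role content-type="">Writing – review &amp; editing</role>
    <xref ref-type="aff" rid="aff001"><sup>1</sup></xref>
  </contrib>
</contrib-group>"""


def author_roles(xml_text):
    """Return a list of (surname, given_names, [roles]) per author contrib."""
    root = ET.fromstring(xml_text)
    rows = []
    for contrib in root.iter("contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        surname = contrib.findtext("name/surname", default="")
        given = contrib.findtext("name/given-names", default="")
        roles = [(r.text or "").strip() for r in contrib.findall("role")]
        rows.append((surname, given, roles))
    return rows


for surname, given, roles in author_roles(SAMPLE):
    print(f"{given} {surname}: {'; '.join(roles)}")
```

The same function should work on a whole article file read in as text, since it just walks every `contrib` element it finds.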


Hope that helps!

PS - For those curious, the trick is to put three backticks on separate lines above and below the markup.