Wednesday, 11 November 2020

Enriching Europe PMC publications with Metagenomics annotations

We are excited to announce the recent release of metagenomics annotations for Europe PMC publications. Thanks to our joint work with MGnify on the EMERALD project, recognising metagenomics key terms in literature can now provide detailed biome descriptions for unravelling microbial implications in a variety of environmental, engineered and host-associated phenomena. Using a mixture of literature-based machine learning (ML) and natural language processing (NLP) approaches, terms pertinent to 16 novel metagenomics concepts were identified in Europe PMC literature. Those concepts cover entities related to biome sample and origin as well as metagenomics experimental methods, and are all available in the Europe PMC annotations platform. Check the descriptions below to learn about these concepts (annotation types).

Annotation type descriptions:

  • The organism where the microbiome is found
  • Microbiome's man-made environment
  • Microbiome's natural environment
  • Sampling date
  • Microbiome's place or geocoordinates
  • Microbiome's site within place
  • Host body region/structure where microbiome is found
  • Host/Environment state
  • Sample from which the microbiome is extracted
  • Host/Environment treatments
  • Nucleic acid extraction kit
  • Target gene(s) (e.g. hypervariable regions of 16S/18S rRNA gene)
  • PCR primers
  • Library strategy (e.g. amplicon, whole metagenome)
  • Library construction method (e.g. paired-end, single-end)
  • Sequencing platform

Publications with the new annotations are searchable on Europe PMC using the search syntax (ANNOTATION_PROVIDER:"Metagenomics"). Entity types can be highlighted in the text of open access articles using the SciLite annotations tool.

Additionally, these annotations are available for programmatic access via Europe PMC search and annotation APIs. 
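As a rough illustration of programmatic access, the sketch below builds request URLs for the search syntax quoted above and for fetching the annotations of a single article. The endpoint paths, the `pageSize`, `format` and `articleIds` parameters, and the placeholder article ID are assumptions based on the public Europe PMC REST services and should be checked against the current API documentation before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoints of the Europe PMC search and annotations REST services.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"
ANNOTATIONS_URL = (
    "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"
)


def build_search_url(query: str, page_size: int = 25) -> str:
    """Return a search URL for the given Europe PMC query string."""
    params = {"query": query, "format": "json", "pageSize": page_size}
    return f"{SEARCH_URL}?{urlencode(params)}"


def build_annotations_url(article_ids: list) -> str:
    """Return an annotations URL for IDs given as SOURCE:ID strings."""
    params = {"articleIds": ",".join(article_ids), "format": "JSON"}
    return f"{ANNOTATIONS_URL}?{urlencode(params)}"


def fetch_json(url: str) -> dict:
    """Perform the HTTP GET and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


# The query string is the search syntax given in the post.
search_url = build_search_url('(ANNOTATION_PROVIDER:"Metagenomics")')

# "PMC:PMC0000000" is a hypothetical placeholder, not a real article ID.
annotations_url = build_annotations_url(["PMC:PMC0000000"])
```

Calling `fetch_json(search_url)` would then return the matching publications as a dictionary. The URL builders are kept separate from the network call so that the request can be inspected before it is sent.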

Explore metagenomics annotation types in Europe PMC and give us your feedback. Want to know more about the Europe PMC annotation platform? Get in touch with

Tuesday, 22 September 2020

Announcing the new version of SciLite - the Europe PMC tool for highlighting annotations

This month, Europe PMC released a new version of SciLite, a powerful tool for highlighting annotations in life sciences publications. SciLite is powered by the Europe PMC annotation platform via the open annotation API, which provides access to over 1.3 billion annotations. Highlighting annotations in the text enables users to easily scan the article and locate key biological entities, such as genes/proteins, accession numbers, protein interactions, diseases, gene-disease relationships and more.

SciLite has been redesigned to improve the speed of information retrieval and to help scientists and database curators scan articles, extract facts and evidence from the biomedical literature, and locate the primary data cited in a given publication. Imagine being able to locate and quickly visualise a protein structure of interest on a single page! This is how SciLite helps to speed up scientific discovery.

Annotations can be accessed via the ‘Annotations’ option on the right-hand side of the article page. Clicking on ‘Annotations’ opens a new panel, and selected terms will be highlighted in the text of the article. Note that annotations can only be highlighted in articles with an open access license.

The new SciLite version includes new annotation types, a new annotation panel interface and new features. Annotation types now include cell, cell lines, clinical drugs, molecular processes, organ tissues, pathway, anatomy and phenotype. The new panel offers easy navigation through annotations and displays a popup window with a quick link to relevant data resources.

Additionally, the new version offers a chemical structure viewer. Readers are able to visualise protein and chemical structures in the annotation panel as well as in the highlighted text.

Highlighted annotations display links to relevant database records enabling users to locate the primary data in the text by linking text mined and curated bioentities to public life sciences databases. Additionally, the new improvements include options to endorse, report errors or share the annotation via a linkback URL.

Identifying a plethora of complex biological terms and concepts in publications was made possible by a successful partnership with a variety of text mining groups, which use text mining algorithms to identify different types of biological entities. In turn, SciLite annotations enable text miners to showcase their work to a wider public.

Europe PMC thanks all the annotation providers for their cooperation in submitting annotations, and welcomes new contributions from the text-mining and curation communities via the annotations submission service. Europe PMC would also like to thank all the participants who took part in usability sessions to test and give feedback on the improvements to the SciLite annotations tool.

Want to know more about SciLite, annotation APIs or submitting annotations? Get in touch with

Tuesday, 28 July 2020

Europe PMC partners with UKCDR to release the Covid-19 Awards Search

As a response to the current pandemic crisis, organisations worldwide have increasingly made funding available for COVID-19 research. As part of the COVID CIRCLE programme, the Medical Research Council (MRC) has worked with the UK Collaborative on Development Research (UKCDR) and the Global Research Collaboration for Infectious Disease Preparedness (GloPID-R) to make COVID-19 grant information from international funders more widely available and accessible. This COVID-19 grant data is now searchable via Europe PMC. Europe PMC, the database of biomedical literature, supports grant search via the Grant Finder tool, which allows users to search biomedical research grants awarded by the Europe PMC funders. To date, the initiative has added funding information for over 1,800 COVID-19 grants, awarded by more than 25 funders from countries across the globe, to the Grant Finder. Note that many of these grants have been jointly funded. The grant information is accessible to website users via a 'COVID-19' tab in the Grant Finder tool, and can be programmatically accessed via the Europe PMC GRIST API.
The team at the MRC led by Dr Ian Viney, MRC Director of Strategic Evaluation and Impact, has been responsible for retrieving the grant data from the Covid-19 Tracker as it updates and feeding it in a machine-readable format to the team behind Europe PMC for loading into the Grant Finder.
Dr Viney commented: “It is now crucial that research organisations focus on the gaps and opportunities in the busy coronavirus research landscape. Tracking the research underway globally and in a timely fashion will allow resources to be targeted to best effect, and with the launch of the Europe PubMed Central covid-19 grant finder we now have the tools in place to keep everyone informed.”
For more information, please contact

Monday, 8 June 2020

Europe PMC to include the full text of COVID-19 preprints

In this pandemic, researchers have responded by publishing results rapidly, often through preprints.
In fact, up to half of the publications in Europe PMC on COVID-19 are preprints rather than peer-reviewed journal articles. Currently, the full text of these preprints is scattered as PDFs on preprint servers, or available as a non-standard set of documents for machine learning purposes.

In a new project, supported by Wellcome in partnership with the UK Medical Research Council (MRC) and the Swiss National Science Foundation (SNSF), we will make the full text of COVID-19 preprints available on Europe PMC, a large and sustainable life sciences archive, for reading and reuse via a standard XML format, alongside peer-reviewed full text articles.

“Being able to easily search and read preprint full text on a site already frequented by millions of users means that they will be significantly more discoverable by people, and more open to scrutiny”, says Jo McEntyre, Associate Director of EMBL-EBI and Head of Literature Services. “We will make use of existing infrastructure to integrate these COVID-19 preprints into the typical ecosystem of publications - for example, linking to the underlying data”.

This will accelerate scientific research on COVID-19, provide an opportunity to build new open and rapid publication systems, and form a corpus for future history of science research.

"The COVID-19 pandemic has the potential to be the "Napster moment" for preprints, which in turn could herald a way in which research outputs are shared in the future.  Ensuring preprints are fully discoverable and reusable will help ensure that they are treated as first class research objects.  This initiative to make them fully discoverable through the Europe PMC repository will help to realise this goal."
said Robert Kiley, Head of Open Research at Wellcome.

In the coming weeks and months, we will be engaging with preprint servers and preprint authors to make as many COVID-19 preprints as possible available for reading and reuse via Europe PMC. If you are the corresponding author on a preprint on SARS-CoV-2 or COVID-19, published on one of the major preprint platforms, we will soon be asking you to review the Europe PMC version of your work.

Europe PMC has been indexing preprint abstracts since 2018 and has experience in managing full text workflows at large scale. Approximately 150K preprint abstracts currently sit alongside all of PubMed (31M abstracts) and 6M full text articles shared with PMC in the USA. These are XML-based workflows built on international standards (JATS). Europe PMC has played a leading role in preprint community standards development and has hands-on expertise in publication workflows that include versions across multiple sources with diverse approaches.

In addition, Europe PMC has several mechanisms to integrate related material, whether it is data behind the paper to substantiate conclusions, ORCID claiming of preprints, inclusion of preprints in citation networks, comments on peer review platforms, impact metrics, or links to reagents.

Jessica Polka, Executive Director of ASAPbio said:

"This exciting initiative will create a reading experience that is not only more seamless, but also potentially richer. Openly licensed documents can be text mined and enhanced with SciLite annotations, adding context for readers. The inclusion of the full text of COVID-19 preprints in Europe PMC is an important step in the integration of preprints into the scholarly communications infrastructure."

To find out more about this initiative, read the Preprints in Europe PMC page or write to Europe PMC at:

Thursday, 4 June 2020

Global grant IDs in Europe PMC

In September 2019 we welcomed the first open, global grant IDs to Europe PMC. Wellcome became the first funder to register DOIs for their grant awards with Crossref. Grant metadata is provided by Europe PMC, and in a pilot integration, the grant IDs were featured in a PLOS ONE publication. This is a very exciting development that will hopefully make it easier to effectively track the impact of research funding. So what is the big news?

What are grant IDs?

Open, global grant identifiers have been in the making for some time. Though grant numbers are already in use by many funders, they are often local and based on an internal pattern. 
This can create issues when looking up grant information provided by different funders. For example, two records exist in Europe PMC for the local grant ID 207467: one awarded by Wellcome and another by the European Research Council.
One of the proposed solutions is the adoption of global grant IDs in the form of DOIs, which offer the advantages of being unique, persistent, and easy to integrate into existing systems.
To create DOIs for their grant awards, funders need to follow three simple steps:
  • join Crossref as a member;
  • register associated metadata, such as grant award title, amount, currency, date, etc;
  • provide an openly available online resource to which the DOI would resolve - a landing page describing the grant.
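The collision described above, and the way a DOI resolves it, can be sketched in a few lines. The `GrantRecord` type, the DOI values and the titles below are hypothetical and purely illustrative. The shared local ID and the two funders come from the example in the text, and the fields mirror the kind of metadata a funder registers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrantRecord:
    """Minimal, hypothetical grant record keyed by a global grant ID (DOI)."""
    doi: str        # global, unique identifier
    local_id: str   # funder-internal grant number
    funder: str
    title: str


# Two grants from different funders that share the local ID 207467.
# The DOIs and titles are made-up placeholders, not registered values.
records = [
    GrantRecord("10.5555/hypothetical-a", "207467", "Wellcome", "Example A"),
    GrantRecord("10.5555/hypothetical-b", "207467",
                "European Research Council", "Example B"),
]

# Indexing by local ID silently drops one record, because the keys collide.
by_local_id = {record.local_id: record for record in records}

# Indexing by DOI keeps both, because each DOI is unique.
by_doi = {record.doi: record for record in records}

print(len(by_local_id), len(by_doi))
```

With local IDs alone, a lookup also needs the funder name to disambiguate. With DOIs, the identifier is sufficient on its own.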

Creating grant DOIs

As the first step in the process, Wellcome became a Crossref member through a new type of membership developed specifically for this purpose. It is distinct from the publisher membership and is designed for funding agencies. Upon registration Wellcome was assigned a unique DOI prefix.

[Figure: DOI example for a Wellcome-supported grant.]
For the second step, Europe PMC registered global grant IDs for Wellcome-funded awards with Crossref on behalf of the funding agency. Because Europe PMC already runs an open database of grant awards for all 31 Europe PMC funders, we were best placed to provide Crossref with the necessary metadata. 
The metadata schema is comprehensive and allows for a detailed overview of the grant. Europe PMC presents the information registered for most of the fields, including the funding percentage for grant awards supported by multiple funders, or ORCIDs for investigators on the grant (see scheme below). The full schema and documentation are available on GitHub.

[Figure: Metadata fields that Europe PMC provides to Crossref when registering a global grant ID.]

Greater transparency for funding information

The beauty of a DOI is that it points to a physical location, some place where an interested user can see the details of a particular grant. For Wellcome, that place is a grant record on Europe PMC’s website. This means that the funding information becomes more transparent and can be easily linked to the research output.
[Figure: Example of the grant record on the Europe PMC website. This record matches the grant DOI referenced in the PLOS ONE article below.]
In a pilot effort the journal PLOS ONE coordinated with Wellcome-funded authors to include newly registered global grant IDs in the metadata of the publication. This means that the readers can now seamlessly navigate from the article to the grant record and examine the support provided by Wellcome for this particular study.

[Figure: Global grant IDs for two Wellcome grants featured in the Funding section of a PLOS ONE publication. The link for the first of these Wellcome grants leads to the grant record shown in the figure above.]
Notably, all of this grant-associated metadata is freely available not only on Europe PMC’s website but also programmatically, through the public GRIST API. The newly created global grant ID has been incorporated into the API response along with the local grant number. The information will also be available via Crossref’s APIs later in 2020.
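A minimal sketch of how a client might read both identifiers from a GRIST response is given below. The base URL, the `gid:` query field, the `resultType` parameter and the `Grant`, `Id` and `Doi` keys are all assumptions about the API, and the sample record is a hand-written placeholder rather than real output. Check the GRIST API documentation for the actual request and response format.

```python
from urllib.parse import quote

# Assumed base URL of the public GRIST REST service.
GRIST_BASE = "https://www.ebi.ac.uk/europepmc/GristAPI/rest/get"


def build_grist_url(query: str, result_type: str = "core") -> str:
    """Build a GRIST request URL (assumed path-style query syntax)."""
    return (
        f"{GRIST_BASE}/query={quote(query)}"
        f"&resultType={result_type}&format=json"
    )


def grant_identifiers(record: dict) -> tuple:
    """Return (local grant number, global grant DOI) from one record.

    The key names are assumptions; a missing key yields None.
    """
    grant = record.get("Grant", {})
    return grant.get("Id"), grant.get("Doi")


# Look up grants by local grant number ("gid:" is an assumed field name).
url = build_grist_url("gid:207467")

# Hand-written placeholder record; the DOI is hypothetical.
sample_record = {"Grant": {"Id": "207467", "Doi": "10.5555/hypothetical-grant"}}
local_id, grant_doi = grant_identifiers(sample_record)
```

Returning `None` for a missing key lets the same function handle records for grants that have not yet been assigned a DOI.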

Future plans

DOIs for grants have been registered on behalf of Wellcome for 237 grants awarded in 2019. Grant IDs will be assigned retrospectively to Wellcome grants awarded and registered in Europe PMC’s GRIST database from years prior to 2019. This will encompass approximately 13,500 Wellcome grants currently available in Europe PMC. 

The adoption of global grant IDs also allows us to create a more interlinked PID (persistent identifier) graph. As Europe PMC hosts data for both publications and grant awards, we are well positioned to link publication DOIs with DOIs for grants, supporting better tracking of the impact of research funding.
We hope that, once global grant IDs are widely implemented, grant data can be easily collected on submission by publishers and repositories and automatically fed into researcher assessment platforms, thereby simplifying researchers’ workflows.

For more information

Europe PMC funders, please contact if you’d like more information about registering grant DOIs. 
For more general information from Crossref please see their website.
Europe PMC’s contribution to this work has been funded by the Europe PMC funders and the European Commission: FREYA - Connected Open Identifiers for Discovery, Access and Use of Research Resources (777523)