News blog

Updates from Europe PMC, a global database of life sciences literature

Europe PMC team | 3 May 2022


Europe PMC improves discoverability of preprints


Europe PMC now includes the full text of preprints supported by Europe PMC funders

Open science is at the heart of Europe PMC, which provides access to open content and data. Recognising the role that preprints play in letting life science researchers share their findings openly and rapidly, Europe PMC has made over 420,000 preprint abstracts from 24 preprint servers discoverable alongside journal publications. Following the success of the COVID-19 full text preprints initiative, which currently includes over 31,000 full text COVID-19 preprints, Europe PMC is expanding the number of searchable full text preprints to include those supported by Europe PMC funders. Overall, this new project aims to increase the discoverability of science reported in preprints, expand the collection of full text preprints for future analyses, and improve the visibility of preprints supported by Europe PMC funders.

Which preprints are included?

From 1 April 2022, Europe PMC includes the full text of preprints that acknowledge funding from at least one of the 36 Europe PMC funders and have a Creative Commons licence. As a first step, Europe PMC has added preprints from medRxiv, bioRxiv, and Research Square, with plans to expand to other preprint servers in the future.

How does it work?

Europe PMC converts the freely available full text from PDF to a machine-readable XML format suitable for text mining. A preview of how the preprint will appear in Europe PMC is then shared with the corresponding author. The full text is added to Europe PMC two weeks later, or immediately if the author approves it sooner. The full text of preprints supported by Europe PMC funders is searchable alongside other preprint abstracts, both through the Europe PMC website and programmatically via the API. It is also available for bulk download as part of the Preprints subset for future analyses.
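As an illustration of the programmatic route, here is a minimal Python sketch that builds a search request limited to preprints. The endpoint URL, the `SRC:PPR` source filter, the `pageSize` parameter, and the `resultList`/`result` response layout are assumptions drawn from the public Europe PMC RESTful API and should be checked against its documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the Europe PMC RESTful search endpoint.
SEARCH_URL = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_preprint_search_url(terms: str, page_size: int = 25) -> str:
    """Return a search URL restricted to preprint records.

    SRC:PPR is assumed to be the source filter that selects preprints.
    """
    params = {
        "query": f"({terms}) AND SRC:PPR",
        "format": "json",
        "pageSize": page_size,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


def search_preprints(terms: str, page_size: int = 25) -> list:
    """Fetch one page of matching preprints (performs a network request)."""
    with urlopen(build_preprint_search_url(terms, page_size)) as response:
        payload = json.load(response)
    # The nesting resultList -> result is an assumption about the response.
    return payload.get("resultList", {}).get("result", [])
```

Calling `search_preprints("zebrafish")` would then return a list of result records, if the assumptions above hold.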

What are the benefits?

While the full text of each preprint is openly available from the corresponding preprint server, there are numerous advantages to including it in Europe PMC. 

Being able to view the full text directly on Europe PMC is more convenient for users and makes research presented in preprints more discoverable. For preprint authors supported by Europe PMC funders, this means higher visibility and wider reach for their scientific findings.

By default, Europe PMC search covers the full text of indexed journal articles and preprints, not just their abstracts. Having the full text of these preprints within Europe PMC therefore means that they are surfaced even when the search terms appear only outside the abstract. It also enables advanced search options, such as limiting a search to specific sections of the preprint, for example Figures, Results, or Methods.
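A section-limited search can be expressed as a fielded query string. The sketch below composes one in Python; the field names `FIG`, `RESULTS`, and `METHODS` and the `SRC:PPR` preprint filter are assumptions about the Europe PMC search syntax, and `section_query` is a hypothetical helper, not part of any Europe PMC library.

```python
# Assumed mapping from the sections named in the text to search field names.
SECTION_FIELDS = {
    "figures": "FIG",
    "results": "RESULTS",
    "methods": "METHODS",
}


def section_query(section: str, phrase: str, preprints_only: bool = True) -> str:
    """Build a query that matches `phrase` only within one section."""
    key = section.lower()
    if key not in SECTION_FIELDS:
        raise ValueError(f"unsupported section: {section!r}")
    if '"' in phrase:
        # Keep the sketch simple: do not attempt to escape embedded quotes.
        raise ValueError("phrase must not contain double quotes")
    query = f'{SECTION_FIELDS[key]}:"{phrase}"'
    if preprints_only:
        # SRC:PPR is assumed to restrict results to preprints.
        query += " AND SRC:PPR"
    return query
```

The resulting string can be pasted into the website search box or passed as the query parameter of an API request.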

Making the full text of preprints available programmatically in a structured machine-readable format also supports text and data mining. The Europe PMC text and data mining pipeline, in collaboration with several text-mining groups, identifies key biological entities, such as data accessions or gene/protein names, experimental methods, protein interactions, mutations, gene-disease relationships and more, in the abstracts and available full text of preprints. This enables better linking of the literature and the data behind it. The text mining pipeline powers the Annotations tool, which allows readers to quickly scan preprints of interest to find data and evidence presented in the manuscript. 
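Text-mined annotations can also be requested programmatically. The following Python sketch only builds the request URL; the endpoint, the `articleIds`, `type`, and `format` parameters, the `PPR:` identifier prefix, and the example type name `Gene_Proteins` are assumptions about the Europe PMC Annotations API that should be verified against its documentation, and `build_annotations_url` is a hypothetical helper.

```python
from typing import Iterable, Optional
from urllib.parse import urlencode

# Assumed endpoint of the Europe PMC Annotations API.
ANNOTATIONS_URL = (
    "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"
)


def build_annotations_url(
    preprint_ids: Iterable[str], annotation_type: Optional[str] = None
) -> str:
    """Return a URL requesting annotations for the given preprint IDs.

    Each ID is prefixed with "PPR:" on the assumption that article IDs
    take the form "<source>:<id>".
    """
    params = {
        "articleIds": ",".join(f"PPR:{pid}" for pid in preprint_ids),
        "format": "JSON",
    }
    if annotation_type is not None:
        # Optional filter, e.g. "Gene_Proteins" (assumed type name).
        params["type"] = annotation_type
    return f"{ANNOTATIONS_URL}?{urlencode(params)}"
```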

Full text can also support future research on research, for example around the impact of peer review or data availability. For users carrying out bioinformatic studies or literature reviews, open access in Europe PMC to the full text preprint collection from multiple preprint servers, both on the website and programmatically via RESTful APIs, makes analysis easier and further supports open sharing of data.

Finally, as an archive of scholarly content, Europe PMC contributes to longevity and continued access to scientific data and findings presented in preprints. We believe that preprints can remove barriers to open science and Europe PMC is committed to making the science reported in preprints more widely discoverable. 

For more information about preprints in Europe PMC, visit our website: https://europepmc.org/Preprints


Creative Commons Licence
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

