Monday, 20 June 2022

Medical Research Foundation joins Europe PMC

We are delighted to announce that the Medical Research Foundation has joined Europe PMC as a new funder. This brings the Europe PMC funder family to 37 members.

The Medical Research Foundation is the charitable foundation of the Medical Research Council. With support from the scientific and medical communities and the public, the charity funds high-quality medical research that improves human health and changes people’s lives. The Foundation is a purely research-led organisation, meaning all its efforts are centred around funding scientists and research. This allows them to focus solely on finding and funding research that has the potential to change lives now, or may become important in the future.

Researchers funded by the Medical Research Foundation will join thousands of others who make their published research articles freely available from Europe PMC as soon as possible without any embargo period. If your research is supported by the Medical Research Foundation, you can submit your published manuscript for inclusion in Europe PMC via the Europe PMC plus manuscript submission system.

You can now find publications supported by the Medical Research Foundation, as well as Medical Research Foundation grant awards via Europe PMC search and the Grant finder tool. 

For more information about joining the Europe PMC funders group, visit our website: http://europepmc.org/Joining

Tuesday, 17 May 2022

Health and Care Research Wales joins Europe PMC funders group

We are delighted to announce that Health and Care Research Wales has joined Europe PMC as a new funder. This brings the Europe PMC funder family to 36 members.

Health and Care Research Wales is a networked organisation which brings together a wide range of partners across the NHS in Wales, local authorities, universities, research institutions, the third sector, and others. Health and Care Research Wales aims to ensure that today’s research makes a difference to tomorrow’s care. To achieve this goal, Health and Care Research Wales brings together partners to promote research into diseases, treatments, and services, which can lead to discoveries and innovations to improve and save lives. Health and Care Research Wales is supported by the Welsh Government.

Health and Care Research Wales researchers will join thousands of others who have made a commitment to making research open access through inclusion of research articles in Europe PMC. Health and Care Research Wales requires all peer-reviewed research articles submitted on or after 1st September 2022 to be published under the Creative Commons attribution licence (CC BY) (or Open Government Licence (OGL) when subject to Crown Copyright), made open access, and included in Europe PMC as soon as they are published, without any embargo period. Health and Care Research Wales strongly encourages authors to self-archive in Europe PMC any peer-reviewed research articles submitted before 1st September 2022. If your research is funded by Health and Care Research Wales, you can submit your published manuscript for inclusion in Europe PMC via the Europe PMC plus manuscript submission system.


Tuesday, 3 May 2022

Europe PMC improves discoverability of preprints

Europe PMC now includes the full text of preprints supported by Europe PMC funders

Open science is at the heart of Europe PMC, providing access to open content and data. Recognising the role that preprints play as a way for life science researchers to openly and rapidly share their findings, Europe PMC has made over 420,000 preprint abstracts from 24 preprint servers discoverable alongside journal publications. Following the success of the COVID-19 full text preprints initiative, which currently includes over 31,000 full text COVID-19 preprints, Europe PMC is expanding the number of searchable full text preprints to include those supported by Europe PMC funders. Overall, this new project aims to increase the discoverability of science reported in preprints, expand the collection of full text preprints for future analyses, and improve the visibility of preprints supported by Europe PMC funders.

Which preprints are included?

From April 1st 2022, Europe PMC includes the full text of preprints that acknowledge funding from at least one of the 36 Europe PMC funders and have a Creative Commons licence. As the first step Europe PMC has added preprints from medRxiv, bioRxiv, and Research Square, with plans to expand to other preprint servers in the future. 

How does it work?

Europe PMC converts the freely available full text in PDF format to a machine-readable XML format suitable for text-mining. A preview of how the preprint will appear in Europe PMC is then shared with the corresponding author. The full text is added to Europe PMC two weeks later or immediately after author approval. The full text of preprints supported by Europe PMC funders is made searchable along with other preprint abstracts through the Europe PMC website as well as programmatically via the API. It is also available for bulk download as part of the Preprints subset for future analyses. 
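The timing rule described above can be sketched in a few lines of Python. This is an illustration only: the function name and the constant are hypothetical and are derived from the description in this post, not from Europe PMC's own code.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "two weeks later" is counted from the day the preview is shared.
REVIEW_WINDOW = timedelta(weeks=2)


def full_text_release_date(preview_shared: date,
                           approved_on: Optional[date] = None) -> date:
    """Return the day the full text would be added to Europe PMC.

    The full text goes live when the author approves the preview,
    or two weeks after the preview was shared, whichever comes first.
    """
    deadline = preview_shared + REVIEW_WINDOW
    if approved_on is not None and approved_on < deadline:
        return approved_on
    return deadline


# No author response: the full text is added after the two-week window.
print(full_text_release_date(date(2022, 4, 1)))
# Early approval: the full text is added on the approval date.
print(full_text_release_date(date(2022, 4, 1), approved_on=date(2022, 4, 4)))
```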

What are the benefits?

While the full text of each preprint is openly available from the corresponding preprint server, there are numerous advantages to including it in Europe PMC. 

Being able to view the full text directly on Europe PMC is more convenient for users and makes research presented in preprints more discoverable. For preprint authors supported by Europe PMC funders this means higher visibility and wider reach for their scientific findings.

By default, Europe PMC search applies to the full text of indexed journal articles and preprints, not just their abstracts. Having the full text of these preprints within Europe PMC therefore means that they are surfaced even when the search terms appear only outside the abstract. It also enables advanced search options, such as limiting a search to specific sections of the preprint, for example Figures, Results, or Methods.
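As a rough illustration of section-level searching, the sketch below assembles a query string that restricts a phrase to one section and to preprints. The field names used here (METHODS, RESULTS, SRC:PPR) are assumptions based on the Europe PMC search syntax and should be checked against the current search documentation; the helper function itself is hypothetical.

```python
def section_query(section: str, phrase: str, preprints_only: bool = True) -> str:
    """Build a Europe PMC query limited to one full-text section.

    `section` is a search field such as "METHODS" or "RESULTS"
    (assumed field names; verify against the search syntax reference).
    """
    clause = f'{section}:"{phrase}"'
    # SRC:PPR is assumed to restrict results to preprints.
    return f"{clause} AND SRC:PPR" if preprints_only else clause


# A phrase that must appear in the Methods section of a preprint.
print(section_query("METHODS", "single-cell RNA sequencing"))
# The same kind of query across all content types, limited to Results.
print(section_query("RESULTS", "viral load", preprints_only=False))
```

The resulting string can be pasted into the Europe PMC search box or passed as the query to the API.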

Making the full text of preprints available programmatically in a structured machine-readable format also supports text and data mining. The Europe PMC text and data mining pipeline, in collaboration with several text-mining groups, identifies key biological entities, such as data accessions or gene/protein names, experimental methods, protein interactions, mutations, gene-disease relationships and more, in the abstracts and available full text of preprints. This enables better linking of the literature and the data behind it. The text mining pipeline powers the Annotations tool, which allows readers to quickly scan preprints of interest to find data and evidence presented in the manuscript. 

Full text can also support future research on research, for example around the impact of peer review or data availability. For users carrying out bioinformatic studies or literature reviews, having open access to the full text preprint collection from multiple preprint servers both on the website and programmatically via RESTful APIs in Europe PMC makes analysis easier and further supports open sharing of data.

Finally, as an archive of scholarly content, Europe PMC contributes to longevity and continued access to scientific data and findings presented in preprints. We believe that preprints can remove barriers to open science and Europe PMC is committed to making the science reported in preprints more widely discoverable. 

For more information about preprints in Europe PMC, visit our website: https://europepmc.org/Preprints

Thursday, 3 March 2022

SciELO Preprints discoverable in Europe PMC

 

We are delighted to announce that SciELO Preprints are now discoverable in Europe PMC. 

SciELO (Scientific Electronic Library Online) is a bibliographic database, digital library, and cooperative electronic publishing model of open access journals. It was originally established in Brazil in 1997 and has since expanded to include collections from 16 countries, predominantly in Latin America. 

In 2020 SciELO and the Public Knowledge Project (PKP) launched the SciELO Preprints Collection to accelerate the availability of research articles and other scientific communications.

As an avid supporter of open science, Europe PMC has been indexing life science preprints alongside journal articles since 2018. Currently, over 400,000 preprints from over 20 different platforms are available in Europe PMC. Preprints in Europe PMC are enriched with links to open peer review materials, related data, citing articles, and other useful resources. 

Over 1000 SciELO Preprints are available in Europe PMC in their original language, Portuguese, Spanish, or English, and can be accessed using the following search: PUBLISHER:"SciELO Preprints".
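The same search can also be run programmatically. The sketch below only builds the request URL and does not call the network. The endpoint address and the parameter names (query, format, pageSize) are assumptions based on the Europe PMC RESTful API and should be confirmed against its documentation; the helper function is hypothetical.

```python
from urllib.parse import urlencode

# Assumed address of the Europe PMC RESTful search endpoint.
SEARCH_ENDPOINT = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_search_url(query: str, page_size: int = 25) -> str:
    """Return a search URL for the given Europe PMC query string."""
    params = {"query": query, "format": "json", "pageSize": page_size}
    # urlencode percent-encodes the colon and quotes in the query.
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


scielo_url = build_search_url('PUBLISHER:"SciELO Preprints"')
print(scielo_url)
```

Fetching the printed URL with any HTTP client should return the matching records as JSON, assuming the endpoint behaves as described.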

Figure: An example of a SciELO preprint in Europe PMC.

The preprint page displays the title, abstract, and author information. The preprint banner links to the journal published version and to preprint reviews, including those from the recent integration between SciELO Preprints and PREreview. Citation information and alternative metrics are available from the Citations & impact section on the left-hand side. Readers can view genes, diseases and organisms mentioned in the preprint under the Annotations tool on the right-hand side. Authors can also easily add the preprint to their ORCID profile using the Claim to ORCID option on the right.

An important outcome of the new collaboration between Europe PMC and SciELO is the push for changes to scholarly infrastructure to better handle multilingual content. Support for multilingual metadata is now part of Crossref’s public roadmap. Implementation of these changes would enable Europe PMC to host Portuguese, Spanish, and English versions for SciELO Preprints. But much more importantly, it could mean greater accessibility and discoverability of multilingual research across many scholarly platforms.

If you would like to learn more about SciELO Preprints in Europe PMC, please register to join our live demo at 14.00 (GMT) on April 13th.


Monday, 21 February 2022

Europe PMC adopts the Principles of Open Scholarly Infrastructure

As a long-standing service and infrastructure provider in the open science ecosystem, Europe PMC supports the Principles of Open Scholarly Infrastructure (POSI). We welcome the momentum gathering behind this initiative to promote the need to support and sustain the open infrastructure.

Europe PMC has been a part of the public and open infrastructure for over 15 years and is run and managed by EMBL-EBI (which is part of the pan-European organisation of EMBL). It is funded by 34 international funders and is a community-driven, open infrastructure, set in the context of key global open data resources such as the European Nucleotide Archive (INSDC), the wwPDB and the European Genome-Phenome Archive. All of these resources exist for the public good, led by scientific need and international collaborations, and have open governance structures and a commitment to long-term sustainability. Together with PMC USA, Europe PMC is a part of the PubMed Central International archive network, which plays an integral part in fulfilling shared goals to enable international open science. Europe PMC has been selected as an ELIXIR Core Data Resource, which means that it is of fundamental importance to the wider life-science community and the long-term preservation of biological data.

Since the original POSI blog post by Bilder, Lin and Neylon (2015), a number of organisations that provide open infrastructure have adopted the principles, for example Crossref, ROR, OurResearch and DataCite, many of which Europe PMC collaborates with.

We decided to test the Europe PMC open infrastructure against POSI. By completing the review we were able to identify areas of the principles that align, but also others that do not map well to publicly funded services and organisations. We hope that the details below will provide another use case to extend POSI to include a greater diversity of open infrastructures. 

We also thank Geoffrey Bilder (one of the authors of POSI) and Ed Pentz from Crossref for a constructive discussion about POSI and their perception of the principles and how Europe PMC aligns with them. 

How Europe PMC meets the Principles of Open Scholarly Infrastructure (POSI)

Governance

🟢  Coverage across the research enterprise

🟢  Stakeholder governed

🟢  Non-discriminatory membership

🟢  Transparent operations

🟢  Cannot lobby

🟡  Living will

🟢  Formal incentives to fulfil mission & wind-down


Sustainability

🟡  Time-limited funds are used only for time-limited activities

🔴  Goal to generate surplus

🔴  Goal to create contingency fund to support operations for 12 months

🟢  Mission-consistent revenue generation

🟢  Revenue based on services, not data


Insurance

🟡  Open source

🟢  Open data (within constraints of privacy laws)

🟢  Available data (within constraints of privacy laws)

🟡  Patent non-assertion

Governance

🟢 Coverage across the research enterprise

It is increasingly clear that research transcends disciplines, geography, institutions and stakeholders. The infrastructure that supports it needs to do the same.

Europe PMC focuses on content and data from the life sciences; however, it also contains cross-disciplinary content, for example from the social sciences, chemistry and physics. The infrastructure and content ingest process are generic in both cases and utilise cross-discipline community standards (such as the JATS XML data model used by the publishing community to structure scholarly content). Therefore, the infrastructure is discipline agnostic. Content comes from worldwide sources and the users are worldwide.

🟢  Stakeholder governed

A board-governed organisation drawn from the stakeholder community builds more confidence that the organisation will take decisions driven by community consensus and consideration of different interests.

Europe PMC’s governance includes a Scientific Advisory Board and Funder Committee. The Scientific Advisory Board includes representative members of the open science community, including researchers, publishers, funders and text miners.

Europe PMC’s Funder Committee includes representatives of the 34 international funders who support Europe PMC. Current members of these groups can be found on the Europe PMC Governance page.

🟢 Non-discriminatory membership

We see the best option as an “opt-in” approach with a principle of non-discrimination where any stakeholder group may express an interest and should be welcome. The process of representation in day to day governance must also be inclusive with governance that reflects the demographics of the membership.

We have assumed for the purposes of this principle that ‘members’ of Europe PMC are funders and we operate a principle of non-discrimination and any members are welcome. There are two criteria to join Europe PMC as a funder. Funders must be: 

  • public organisations whose legal mission is research funding; or

  • not-for-profit organisations (charities, foundations, associations) that fund research as a part of their mission.

Europe PMC also provides various services that anyone in the community can engage with, such as external links, preprints ingest and the annotations submission service. Use of these services is non-discriminatory but there are required criteria for participation. For example, preprints must meet certain minimum metadata standards and external links should be open and free to access and add value to the content.

🟢 Transparent operations

Achieving trust in the selection of representatives to governance groups will be best achieved through transparent processes and operations in general (within the constraints of privacy laws).

The Europe PMC Funder Committee terms of reference define the process and criteria for new funders joining. 

Europe PMC receives external grant funding and in-kind support from EMBL. Details of Europe PMC funding, which is provided by 34 funders of life science research and coordinated by Wellcome Trust, are available via Europe PMC’s grant finder. Details of how EMBL-EBI is funded are provided in the annual report.

Europe PMC undertakes user research and outreach activities on an ongoing basis to identify community and user needs. These are translated into new features and improvements, as appropriate. Europe PMC publishes a quarterly roadmap and in addition has published anonymised user research findings on Figshare.

🟢 Cannot lobby

The community, not infrastructure organisations, should collectively drive regulatory change. An infrastructure organisation’s role is to provide a base for others to work on and should depend on its community to support the creation of a legislative environment that affects it.

Europe PMC responds to regulatory changes, but does not lobby for them, or drive them. Europe PMC supports changes to funders’ open access policies. We contribute to the development of community standards, for example preprint metadata standards. Europe PMC responds to the needs of its community, for example by ingesting COVID-19 preprints and grants.

🟡 Living will

A powerful way to create trust is to publicly describe a plan addressing the condition under which an organisation would be wound down, how this would happen, and how any ongoing assets could be archived and preserved when passed to a successor organisation. Any such organisation would need to honour this same set of principles.

EMBL is an intergovernmental organisation, powering world-class research through tools, data and facilities. EMBL was established nearly 50 years ago in 1974, and is currently funded by 27 member states. An integral part of EMBL, EMBL-EBI has 25 years of experience in ensuring continuity of access to public infrastructure and data. EMBL-EBI manages the graceful retirement of public data services as part of its portfolio management, exemplified by the retirement of the ArrayExpress archive, which was seamlessly superseded by the BioStudies database in 2021.

Europe PMC is an archive. It aggregates, preserves, and enriches content posted or published elsewhere. The majority of the content in Europe PMC is available via partner databases and open infrastructure providers, including MEDLINE, PMC, Agricola, Crossref, and many others. The data that enriches this content includes text-mined annotations, grant information for Europe PMC funders, links to external resources, and other value-added services. Annotations are in the W3C Web Annotation format and external links are in Scholix format; both are available via APIs. An end of life deposit of the available annotations and external links could be made to a repository for preservation.

Europe PMC’s partnership with PMC USA ensures that if Europe PMC were to wind down, the existing abstracts and PMC full text content (including author manuscripts supported by Europe PMC funders) would still exist in another location, as demonstrated by the retirement of PMC Canada. The full text preprints available in Europe PMC but not elsewhere would be made available as an end of life deposit to a repository for preservation.

🟢 Formal incentives to fulfil mission & wind-down

Infrastructures exist for a specific purpose and that purpose can be radically simplified or even rendered unnecessary by technological or social change. If it is possible the organisation (and staff) should have direct incentives to deliver on the mission and wind down.

As mentioned above, EMBL-EBI has experience and a history of responding where the community needs change. 

Europe PMC is the repository of choice for its 34 funders, enabling their researchers to self-archive their author accepted manuscripts to comply with the funders’ Open Access policies. The landscape of Open Access publishing is changing (driven, for example, by the Plan S initiative). The need for Europe PMC is continually tested through the 5-year cycle of grant renewal and continuous monitoring of use. As Europe PMC is hosted by EMBL-EBI as part of a portfolio of data services, its lifecycle is managed in the same way as other resources such as BioStudies, Ensembl, UniProt and many others.

Sustainability

🟡 Time-limited funds are used only for time-limited activities

Day to day operations should be supported by day to day sustainable revenue sources. Grant dependency for funding operations makes them fragile and more easily distracted from building core infrastructure.

Europe PMC is funded by a combination of a grant (contributed to by 34 funders) and EMBL-EBI support. It has been running for 15 years and has grant funding secured until 2026. As a general rule, EMBL-EBI supports the core infrastructure and the grant supports both the maintenance of Europe PMC services and the development of new ones. Further project-specific grants may fund aligned developments that enrich Europe PMC, but the main Europe PMC service is not dependent on these for its core service delivery.  

🔴 Goal to generate surplus

Organisations which define sustainability based merely on recovering costs are brittle and stagnant. It is not enough to merely survive, it has to be able to adapt and change. To weather economic, social and technological volatility, they need financial resources beyond immediate operating costs.

As a publicly funded service, Europe PMC does not create surplus funds. Rather, Europe PMC’s resilience is ensured in the broader context of EMBL-EBI and EMBL. 

🔴 Goal to create contingency fund to support operations for 12 months

A high priority should be generating a contingency fund that can support a complete, orderly wind down (12 months in most cases). This fund should be separate from those allocated to covering operating risk and investment in development.

As described above, EMBL’s commitment to Europe PMC effectively provides contingency but not in the form of a specific 12-month fund. 

🟢 Mission-consistent revenue generation

Potential revenue sources should be considered for consistency with the organisational mission and not run counter to the aims of the organisation. 

Europe PMC sits within the wider open data mission of EMBL-EBI. In that context, there are processes that assess grant applications to provide assurance that they are aligned with the organisational and resource missions.

🟢 Revenue based on services, not data

Data related to the running of the research enterprise should be a community property. Appropriate revenue sources might include value-added services, consulting, API Service Level Agreements or membership fees.

Europe PMC does not charge for use. A 2021 impact report by independent consultancy Charles Beagrie Ltd and a 2019 report by Technopolis found that EMBL-EBI open data resources, of which Europe PMC is one, offer exceptional value for money.

Insurance

🟡 Open source

All software required to run the infrastructure should be available under an open source license. This does not include other software that may be involved with running the organisation.

Only part of Europe PMC’s software is open source. Some parts of Europe PMC’s software date back 20 years. As Europe PMC components are refactored and replaced over time, they are typically replaced with open source software. For example, since 2019, the Europe PMC plus manuscript submission system has been developed using PubSweet, a free, open source framework developed in collaboration with the Collaborative Knowledge Foundation (Coko) and community partners, including eLife and Hindawi.

All of Europe PMC’s open source software is available via Europe PMC’s public-projects GitLab repository.

In December 2021 EMBL released its internal policy on open science and open access which states:

“EMBL expects all of the above types of software to be Open Source by default in both services and research, and made available in open/community software repositories.”

Europe PMC is committed to increasingly making its software open source. 

🟢 Open data (within constraints of privacy laws)

For an infrastructure to be forked it will be necessary to replicate all relevant data. The CC0 waiver is best practice in making data legally available. Privacy and data protection laws will limit the extent to which this is possible.

EMBL-EBI has at its core a mission to deliver open data. EMBL-EBI’s principles of service provision state: 

“Our data and tools are freely available, without restriction. The only exception is potentially identifiable human genetic information, for which access depends on research consent agreements.”

A public executive summary and further details on the licensing of EMBL-EBI data resources are available on the EMBL-EBI webpage. Details on general copyright restrictions are available for Europe PMC because some content is protected by publisher copyright statements. Articles and other material in Europe PMC usually contain an explicit copyright statement.

🟢  Available data (within constraints of privacy laws)

It is not enough that the data be made “open” if there is not a practical way to actually obtain it. Underlying data should be made easily available via periodic data dumps.

Article, grant and annotations metadata in Europe PMC can be accessed freely with no registration requirement for the use of the website, APIs and the bulk download via FTP. Europe PMC provides an open access subset of full-text articles and COVID-19 preprints with more permissive Creative Commons type licenses (or similar), which is available using the APIs or as bulk downloads.

🟡  Patent non-assertion

The organisation should commit to a patent non-assertion covenant. The organisation may obtain patents to protect its own operations, but not use them to prevent the community from replicating the infrastructure.

Europe PMC does not hold any patents and we cannot foresee any circumstances under which we would apply for one. In the event that we did need to apply for a patent, we would be willing to issue a patent non-assertion covenant to assure stakeholders and the community that we would not use a patent to prevent the community from using Europe PMC code or infrastructure.

Conclusion

As seen in the self-audit above, Europe PMC is committed to providing open data supported by high quality, sustainable, open and community driven infrastructure. This audit highlights some areas in which Europe PMC could improve, for example by increasing the proportion of our code that is open source. However, it would be valuable to work with the POSI authors and other infrastructure providers in the community to revise and adapt POSI to ensure it is appropriate for publicly funded services and organisations.