As we enter 2025, Europe PMC looks back on a year of innovation, growth, and user-driven improvements. This end-of-year reflection highlights our key achievements in developing cutting-edge AI solutions for scientific research, advancing open research software, and supporting transparency in scholarly communication.
Using AI to revolutionise research discovery
To unearth critical insights from biomedical literature and accelerate scientific discovery, Europe PMC is using innovative AI approaches. Developed in collaboration with Open Targets, the Lit-OTAR framework used deep learning to identify over 48 million gene-disease-drug associations, helping researchers prioritise drug targets and design innovative treatments. This tool processes scientific articles daily, providing comprehensive data accessible via the Open Targets Platform and Europe PMC’s Annotations API.
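For readers who want to retrieve these annotations programmatically, the sketch below shows one way a request to the Annotations API might be put together. The endpoint path and the `articleIds` and `format` parameter names are assumptions based on the public API and should be checked against the current Europe PMC documentation. The function name and the article identifiers are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Assumed endpoint of the Europe PMC Annotations API; verify against the
# official documentation before relying on it.
BASE_URL = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"


def build_annotations_url(article_ids, fmt="JSON"):
    """Return a request URL for the annotations of the given articles.

    article_ids: identifiers in 'SOURCE:ID' form (hypothetical examples below).
    fmt: response format requested from the service (assumed parameter).
    """
    params = {"articleIds": ",".join(article_ids), "format": fmt}
    # urlencode percent-encodes ':' and ',' so the query string is URL-safe.
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder identifiers, not real records.
    print(build_annotations_url(["MED:1", "MED:2"]))
```

The sketch only constructs the URL. Fetching it with an HTTP client and filtering the response for gene-disease-drug associations is left out, because the response schema is not described here.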
To support protein research, Europe PMC and PDBe developed an AI–human hybrid workflow to link protein structure and function. By combining expert curation with machine learning, the workflow identifies key amino acid residues – accelerating functional insights, supporting drug target validation, and scaling the annotation process.
All models and code for the Lit-OTAR framework and the AI–human hybrid workflow can be found in the ML4Lit and PDBEurope GitLab repositories, respectively.
Supporting research software
Two years ago, Europe PMC adopted the Principles of Open Scholarly Infrastructure (POSI), a set of guidelines to operate and sustain open scholarly infrastructure. Since then we have made significant progress towards greater transparency, sustainability, and openness. As part of our commitment, we develop all new code as open source and gradually update legacy systems to enable reuse. You can now find 27 open projects hosted on Europe PMC’s public GitLab repository. We believe that open source is more than just sharing your code – it’s about building projects that thrive. We invite you to collaborate, contribute, and repurpose our code to match affiliations, import preprints, or extract key terms from documents. Not sure where to start in creating or reusing open source code? Take the first step and learn from EMBL data scientists for free.
Modern research increasingly relies on software, yet it often lacks appropriate recognition and persistent identifiers, limiting its discoverability and reproducibility. To address this issue, Europe PMC joined a collaborative initiative dedicated to improving the visibility and reuse of open research software, called the SoFAIR project. This project will leverage machine learning and community validation to make software citations available through the Europe PMC Annotations platform. This initiative will not only enhance software visibility and support the reuse of research code, but also ensure that developers receive credit for their work, in line with the FAIR principles for open research outputs. Our participation in SoFAIR underscores our commitment to integrating open research software with life sciences literature, driving innovation, and creating a more connected research ecosystem.
User-focused design
At Europe PMC, we prioritise user-driven design. In 2024, the team made several updates based on user feedback and research. One key improvement was making the sort order more prominent so that users can quickly choose between relevance, date, and times cited. The sorting options were previously displayed as a drop-down menu; we redesigned them and tested the changes using A/B testing – a method of comparing two versions of content by randomly splitting the audience and evaluating performance based on engagement. This approach confirmed that users preferred the new, more accessible design for sorting their results. We also revamped the date filter to display a timeline view, offering a more visual and intuitive way to narrow down searches.
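As an illustration of how an A/B test of this kind can be evaluated, here is a minimal sketch that randomly assigns users to two variants and compares engagement rates with a two-proportion z-test. It is not the analysis Europe PMC ran. The function names, the choice of test, and all counts are hypothetical.

```python
import math
import random


def assign_variant(rng):
    """Randomly place a user in variant 'A' (old design) or 'B' (new design)."""
    return rng.choice(("A", "B"))


def two_proportion_z_test(engaged_a, users_a, engaged_b, users_b):
    """Compare engagement rates of two variants.

    Returns (z, p_value), where p_value is two-sided under the
    normal approximation.
    """
    rate_a = engaged_a / users_a
    rate_b = engaged_b / users_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (engaged_a + engaged_b) / (users_a + users_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    rng = random.Random(0)
    print(assign_variant(rng))
    # Made-up counts: users who changed the sort order in each variant.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A positive z with a small p-value would suggest that variant B attracts more engagement than chance alone would explain. The significance threshold is a decision for whoever runs the experiment.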
We are looking to the future of search, trying to anticipate user needs and changes in scholarly communication. We are exploring the strengths and limitations of the current search tools and have run focus groups to understand your challenges and aspirations around literature research. These insights will guide our efforts to support you with newer, better tools that evolve alongside your expectations. Be the first to know what’s coming next – explore the 2025 developments on the Europe PMC roadmap!
Expanding coverage
Europe PMC continues to grow as a comprehensive database for key scientific research. In 2024 we added preprints from MetaArXiv and VeriXiv, bringing the number of preprint servers indexed by Europe PMC to 35. As of December 2024, Europe PMC provides access to over 880,000 life science preprints, ensuring you stay informed about the latest developments in the field.
This year we have also updated our collection of PhD theses from UK higher education institutions. These records are provided by EThOS, the British Library’s e-thesis online service. Currently, users can access over 176,000 biomedical theses in Europe PMC. Making theses available alongside publications broadens the available knowledge base, encourages the sharing of ideas within the academic community, and provides valuable context for ongoing advancements in the life sciences.
Transparency for preprints and publications
As in life, mistakes can occur at any stage of the research process. Errata and retractions allow scientists to correct the published literature. This prevents the spread of outdated or incorrect information and, in turn, the erosion of public trust in research.
When citing a research work it is important to know whether an article was retracted or corrected, or if a preprint was withdrawn, removed, updated with a new version, or published in a journal. We have included this crucial information in citation files as part of Europe PMC’s commitment to transparency. This means that major publication updates are now visible to the user when a record from Europe PMC is added to a citation manager. This mitigates the risk of unknowingly citing invalid or outdated work and empowers researchers to make responsible decisions about the sources they cite.
Preprints are much more dynamic than journal articles, but in the absence of automated updates, changes to preprints can be difficult to track. We have significantly improved the detection of withdrawn and removed preprints in Europe PMC by monitoring title changes – a new, evolving practice by preprint servers to prefix titles with standard text, such as “Withdrawn”. This has enabled us to automatically tag over 2000 withdrawn and removed preprints, ensuring that users access accurate, up-to-date information, whether they reach an article via the search bar or use our Article Status Monitor tool.
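The title-monitoring idea can be sketched in a few lines. This is a simplified illustration, not Europe PMC's implementation: it assumes a status prefix is a word such as "Withdrawn" or "Removed" followed by a colon, and the prefix list and function names are hypothetical.

```python
import re

# Assumed convention: the server prepends "Withdrawn:" or "Removed:" to the
# title. Real servers may use other wording or punctuation.
_STATUS_PREFIX = re.compile(r"^\s*(withdrawn|removed)\s*:", re.IGNORECASE)


def detect_status(title):
    """Return 'withdrawn' or 'removed' if the title carries that prefix, else None."""
    match = _STATUS_PREFIX.match(title)
    return match.group(1).lower() if match else None


def newly_flagged(old_title, new_title):
    """Return the status introduced by a title change, or None if nothing changed."""
    old_status = detect_status(old_title)
    new_status = detect_status(new_title)
    return new_status if new_status != old_status else None


if __name__ == "__main__":
    # Made-up titles for illustration.
    print(newly_flagged("A study of X", "WITHDRAWN: A study of X"))
    print(newly_flagged("A study of X", "A study of X, revised"))
```

Requiring the colon keeps ordinary titles that merely begin with the word "Withdrawn" from being flagged, at the cost of missing servers that format the prefix differently.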
Since we first indexed preprints 6 years ago, we have worked hard to build trust in them. Our latest publication, “Enabling preprint discovery, evaluation, and analysis with Europe PMC”, summarises this work and outlines our efforts to create a database of preprints and support innovation in scholarly communication.
Final remarks
In 2024, Europe PMC focused on increasing coverage of life science literature, accelerating scientific discovery with innovative AI tools, and supporting open source and transparency. As we move into 2025, we remain committed to delivering a comprehensive, evolving resource that meets the needs of the life science community. To keep up to date with innovations in 2025, please look at the Europe PMC Roadmap.