Citation

BibTeX format

@article{Niu:2024:10.1038/s41597-024-03793-0,
author = {Niu, Z and Xiao, X and Wu, W and Cai, Q and Jiang, Y and Jin, W and Wang, M and Yang, G and Kong, L and Jin, X and Yang, G and Chen, H},
doi = {10.1038/s41597-024-03793-0},
journal = {Scientific Data},
title = {PharmaBench: enhancing ADMET benchmarks with large language models},
url = {http://dx.doi.org/10.1038/s41597-024-03793-0},
volume = {11},
year = {2024}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Accurately predicting ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties early in drug development is essential for selecting compounds with optimal pharmacokinetics and minimal toxicity. Existing ADMET-related benchmark sets are limited in utility due to their small dataset sizes and the lack of representation of compounds used in drug discovery projects. These shortcomings hinder their application in model building for drug discovery. To address this issue, we propose a multi-agent data mining system based on Large Language Models that effectively identifies experimental conditions within 14,401 bioassays. This approach facilitates merging entries from different sources, culminating in the creation of PharmaBench. Additionally, we have developed a data processing workflow to integrate data from various sources, resulting in 156,618 raw entries. Through this workflow, we constructed PharmaBench, a comprehensive benchmark set for ADMET properties, which comprises eleven ADMET datasets and 52,482 entries. This benchmark set is designed to serve as an open-source dataset for the development of AI models relevant to drug discovery projects.
AU  - Niu,Z
AU  - Xiao,X
AU  - Wu,W
AU  - Cai,Q
AU  - Jiang,Y
AU  - Jin,W
AU  - Wang,M
AU  - Yang,G
AU  - Kong,L
AU  - Jin,X
AU  - Yang,G
AU  - Chen,H
DO  - 10.1038/s41597-024-03793-0
PY  - 2024///
SN  - 2052-4463
TI  - PharmaBench: enhancing ADMET benchmarks with large language models
T2  - Scientific Data
UR  - http://dx.doi.org/10.1038/s41597-024-03793-0
UR  - https://www.nature.com/articles/s41597-024-03793-0
VL  - 11
ER  - 
