Imperial College London

Professor Robert Glen

Faculty of Medicine, Department of Metabolism, Digestion and Reproduction

Chair in Computational Medicine
 
 
 

Contact

 

+44 (0)20 7594 7912
r.glen

 
 

Location

 

362, Sir Alexander Fleming Building, South Kensington Campus



 

Publications

Citation

BibTeX format

@article{Bender:2005:10.1021/ci0500177,
author = {Bender, A and Glen, RC},
doi = {10.1021/ci0500177},
journal = {J Chem Inf Model},
pages = {1369--1375},
title = {A discussion of measures of enrichment in virtual screening: comparing the information content of descriptors with increasing levels of sophistication},
url = {http://dx.doi.org/10.1021/ci0500177},
volume = {45},
year = {2005}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - We have performed virtual screening using some very simple features, by employing the number of atoms per element as molecular descriptors but without regard to any structural information whatsoever. Surprisingly, these atom counts are able to outperform virtual-affinity-based fingerprints and Unity fingerprints in some activity classes. Although molecular weight and other biases were known in target-based virtual screening settings (docking), we report the effect of using very simple descriptors for ligand-based virtual screening, by using clearly defined biological targets and employing a large data set (>100,000 compounds) containing multiple (11) activity classes. Structure-unaware atom count vectors as descriptors in combination with the Euclidean distance measure are able to achieve "enrichment factors" over random selection of around 4 (depending on the particular class of active compounds), putting the enrichment factors reported for more sophisticated virtual screening methods in a different light. They are also able to retrieve active compounds with novel scaffolds instead of merely the expected structural analogues. The added value of many currently used virtual screening methods (calculated as enrichment factors) drops down to a factor of between 1 and 2, instead of often reported double-digit figures. The observed effect is much less profound for simple descriptors such as molecular weight and is only present in cases of atypical (larger) ligands. The current state of virtual screening is not as sophisticated as might be expected, which is due to descriptors still not being able to capture structural properties relevant to binding. This fact can partly be explained by highly nonlinear structure-activity relationships, which represent a severe limitation of the "similar property principle" in the context of bioactivity.
AU  - Bender,A
AU  - Glen,RC
DO  - 10.1021/ci0500177
EP  - 1375
PY  - 2005///
SN  - 1549-9596
SP  - 1369
TI  - A discussion of measures of enrichment in virtual screening: comparing the information content of descriptors with increasing levels of sophistication.
T2  - J Chem Inf Model
UR  - http://dx.doi.org/10.1021/ci0500177
UR  - https://www.ncbi.nlm.nih.gov/pubmed/16180913
VL  - 45
ER  - 
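
Illustrative sketch

The abstract above describes ranking compounds by the Euclidean distance between structure-unaware atom count vectors and scoring the ranking with an enrichment factor over random selection. The following is a minimal sketch of that idea, not the authors' implementation. The element set, the function names, the formula parser, the toy library, and the activity labels are all hypothetical and chosen only for illustration. The enrichment factor uses a common definition (hit rate in the selected top fraction divided by the hit rate in the whole library), which may differ in detail from the one used in the paper.

```python
import math
import re
from collections import Counter

# Hypothetical element set; the paper's actual descriptor may cover other elements.
ELEMENTS = ("C", "H", "N", "O", "S")


def atom_counts(formula):
    """Turn a simple molecular formula such as 'C9H8O4' into an atom count vector."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return [counts[element] for element in ELEMENTS]


def euclidean(a, b):
    """Euclidean distance between two equal-length count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(query_formula, library):
    """Sort (formula, is_active) pairs by increasing distance to the query."""
    query = atom_counts(query_formula)
    return sorted(library, key=lambda item: euclidean(query, atom_counts(item[0])))


def enrichment_factor(ranked_labels, fraction):
    """Hit rate in the top `fraction` of the ranking divided by the overall hit rate."""
    n_total = len(ranked_labels)
    n_selected = max(1, round(n_total * fraction))
    hits_total = sum(ranked_labels)
    if hits_total == 0:
        raise ValueError("library contains no active compounds")
    hits_selected = sum(ranked_labels[:n_selected])
    return (hits_selected / n_selected) / (hits_total / n_total)


# Toy library with made-up activity labels (True = "active"); not real assay data.
toy_library = [
    ("C7H6O3", True),
    ("C8H8O3", True),
    ("C20H25N3O", False),
    ("C2H6O", False),
    ("C6H12O6", False),
    ("C17H19NO3", False),
    ("C6H6", False),
    ("C12H22O11", False),
    ("C3H8O3", False),
    ("C5H5N5", False),
]

ranked = rank_by_distance("C9H8O4", toy_library)
labels = [is_active for _, is_active in ranked]
ef_top20 = enrichment_factor(labels, 0.2)
print(ranked[:2], ef_top20)
```

In this toy library the two compounds labelled active happen to be the closest to the query in atom count space, so the top 20% contains only actives. The numbers say nothing about real screening performance; they only show how the quantities in the abstract relate to each other.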