Imperial College London

Dr James A Bull

Faculty of Natural Sciences, Department of Chemistry

Reader in Synthetic Chemistry

Contact

 

+44 (0)20 7594 5811
j.bull
Website

Location

 

501b, Molecular Sciences Research Hub, White City Campus


Citation

BibTeX format

@article{Brown:2019:database/baz085,
author = {Brown, P and Tan, A-C and El-Esawi, MA and Liehr, T and Blanck, O and Gladue, DP and Almeida, GMF and Cernava, T and Sorzano, CO and Yeung, AWK and Engel, MS and Chandrasekaran, AR and Muth, T and Staege, MS and Daulatabad, SV and Widera, D and Zhang, J and Meule, A and Honjo, K and Pourret, O and Yin, C-C and Zhang, Z and Cascella, M and Flegel, WA and Goodyear, CS and van Raaij, MJ and Bukowy-Bieryllo, Z and Campana, LG and Kurniawan, NA and Lalaouna, D and Huttner, FJ and Ammerman, BA and Ehret, F and Cobine, PA and Tan, E-C and Han, H and Xia, W and McCrum, C and Dings, RPM and Marinello, F and Nilsson, H and Nixon, B and Voskarides, K and Yang, L and Costa, VD and Bengtsson-Palme, J and Bradshaw, W and Grimm, DG and Kumar, N and Martis, E and Prieto, D and Sabnis, SC and Amer, SEDR and Liew, AWC and Perco, P and Rahimi, F and Riva, G and Zhang, C and Devkota, HP and Ogami, K and Basharat, Z and Fierz, W and Siebers, R and Tan, K-H and Boehme, KA and Brenneisen, P and Brown, JAL and others},
doi = {10.1093/database/baz085},
journal = {Database: the journal of biological databases and curation},
pages = {1--66},
title = {Large expert-curated database for benchmarking document similarity detection in biomedical literature search},
url = {http://dx.doi.org/10.1093/database/baz085},
volume = {2019},
year = {2019}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB - Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency–Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.
AU - Brown,P
AU - Tan,A-C
AU - El-Esawi,MA
AU - Liehr,T
AU - Blanck,O
AU - Gladue,DP
AU - Almeida,GMF
AU - Cernava,T
AU - Sorzano,CO
AU - Yeung,AWK
AU - Engel,MS
AU - Chandrasekaran,AR
AU - Muth,T
AU - Staege,MS
AU - Daulatabad,SV
AU - Widera,D
AU - Zhang,J
AU - Meule,A
AU - Honjo,K
AU - Pourret,O
AU - Yin,C-C
AU - Zhang,Z
AU - Cascella,M
AU - Flegel,WA
AU - Goodyear,CS
AU - van Raaij,MJ
AU - Bukowy-Bieryllo,Z
AU - Campana,LG
AU - Kurniawan,NA
AU - Lalaouna,D
AU - Huttner,FJ
AU - Ammerman,BA
AU - Ehret,F
AU - Cobine,PA
AU - Tan,E-C
AU - Han,H
AU - Xia,W
AU - McCrum,C
AU - Dings,RPM
AU - Marinello,F
AU - Nilsson,H
AU - Nixon,B
AU - Voskarides,K
AU - Yang,L
AU - Costa,VD
AU - Bengtsson-Palme,J
AU - Bradshaw,W
AU - Grimm,DG
AU - Kumar,N
AU - Martis,E
AU - Prieto,D
AU - Sabnis,SC
AU - Amer,SEDR
AU - Liew,AWC
AU - Perco,P
AU - Rahimi,F
AU - Riva,G
AU - Zhang,C
AU - Devkota,HP
AU - Ogami,K
AU - Basharat,Z
AU - Fierz,W
AU - Siebers,R
AU - Tan,K-H
AU - Boehme,KA
AU - Brenneisen,P
AU - Brown,JAL
AU - Dalrymple,BP
AU - Harvey,DJ
AU - Ng,G
AU - Werten,S
AU - Bleackley,M
AU - Dai,Z
AU - Dhariwal,R
AU - Gelfer,Y
AU - Hartmann,MD
AU - Miotla,P
AU - Tamaian,R
AU - Govender,P
AU - Gurney-Champion,OJ
AU - Kauppila,JH
AU - Zhang,X
AU - Echeverria,N
AU - Subhash,S
AU - Sallmon,H
AU - Tofani,M
AU - Bae,T
AU - Bosch,O
AU - Cuiv,PO
AU - Danchin,A
AU - Diouf,B
AU - Eerola,T
AU - Evangelou,E
AU - Filipp,FV
AU - Klump,H
AU - Kurgan,L
AU - Smith,SS
AU - Terrier,O
AU - Tuttle,N
AU - Ascher,DB
AU - Janga,SC
AU - Schulte,LN
AU - Becker,D
AU - Browngardt,C
AU - Bush,SJ
AU - Gaullier,G
AU - Ide,K
AU - Meseko,C
AU - Werner,GDA
AU - Zaucha,J
AU - Al-Farha,AA
AU - Greenwald,NF
AU - Popoola,SI
AU - Rahman,MS
AU - Xu,J
AU - Yang,SY
AU - Hiroi,N
AU - Alper,OM
AU - Baker,CI
AU - Bitzer,M
AU - Chacko,G
AU - Debrabant,B
AU - Dixon,R
AU - Forano,E
AU - Gilliham,M
AU - Kelly,S
AU - Klempnauer,K-H
AU - Lidbury,BA
AU - Lin,MZ
AU - Lynch,I
AU - Ma,W
AU - Maibach,EW
AU - Mather,DE
AU - Nandakumar,KS
AU - Ohgami,RS
AU - Parchi,P
AU - Tressoldi,P
AU - Xue,Y
AU - Armitage,C
AU - Barraud,P
AU - Chatzitheochari,S
AU - Coelho,LP
AU - Diao,J
AU - Doxey,AC
AU - Gobet,A
AU - Hu,P
AU - Kaiser,S
AU - Mitchell,KM
AU - Salama,MF
AU - Shabalin,IG
AU - Song,H
AU - Stevanovic,D
AU - Yadollahpour,A
AU - Zeng,E
AU - Zinke,K
AU - Alimba,CG
AU - Beyene,TJ
AU - Cao,Z
AU - Chan,SS
AU - Gatchell,M
AU - Kleppe,A
AU - Piotrowski,M
AU - Torga,G
AU - Woldesemayat,AA
AU - Cosacak,MI
AU - Haston,S
AU - Ross,SA
AU - Williams,R
AU - Wong,A
AU - Abramowitz,MK
AU - Effiong,A
AU - Lee,S
AU - Abid,MB
AU - Agarabi,C
AU - Alaux,C
AU - Albrecht,DR
AU - Atkins,GJ
AU - Beck,CR
AU - Bonvin,AMJJ
AU - Bourke,E
AU - Brand,T
AU - Braun,RJ
AU - Bull,JA
AU - Cardoso,P
AU - Carter,D
DO - 10.1093/database/baz085
EP - 66
PY - 2019///
SN - 1758-0463
SP - 1
TI - Large expert-curated database for benchmarking document similarity detection in biomedical literature search
T2 - Database: the journal of biological databases and curation
UR - http://dx.doi.org/10.1093/database/baz085
UR - http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=PARTNER_APP&SrcAuth=LinksAMR&KeyUT=WOS:000494411700001&DestLinkType=FullRecord&DestApp=ALL_WOS&UsrCustomerID=1ba7043ffcc86c417c072aa74d649202
UR - https://academic.oup.com/database/article/doi/10.1093/database/baz085/5608006
UR - http://hdl.handle.net/10044/1/77930
VL - 2019
ER -