Imperial College London

Dr James Kinross

Faculty of Medicine, Department of Surgery & Cancer

Reader in General Surgery
 
 
 

Contact

 

+44 (0)20 3312 1947
j.kinross

 
 

Location

 

1029, Queen Elizabeth the Queen Mother Wing (QEQM), St Mary's Campus



 

Publications

Citation

BibTeX format

@article{Vaghela:2021:10.2196/preprints.25714,
author = {Vaghela, U and Rabinowicz, S and Bratsos, P and Martin, G and Fritzilas, E and Markar, S and Purkayastha, S and Stringer, K and Singh, H and Llewellyn, C and Dutta, D and Clarke, JM and Howard, M and Serban, O and Kinross, J},
doi = {10.2196/preprints.25714},
journal = {Journal of Medical Internet Research},
pages = {1--14},
title = {Using a secure, continually updating, web source processing pipeline to support the real-time data synthesis and analysis of scientific literature: development and validation study},
url = {http://dx.doi.org/10.2196/preprints.25714},
volume = {23},
year = {2021}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Background: The scale and quality of the global scientific response to the COVID-19 pandemic have unquestionably saved lives. However, the COVID-19 pandemic has also triggered an unprecedented “infodemic”; the velocity and volume of data production have overwhelmed many key stakeholders such as clinicians and policy makers, as they have been unable to process structured and unstructured data for evidence-based decision making. Solutions that aim to alleviate this data synthesis–related challenge are unable to capture heterogeneous web data in real time for the production of concomitant answers and are not based on the high-quality information in responses to a free-text query. Objective: The main objective of this project is to build a generic, real-time, continuously updating curation platform that can support the data synthesis and analysis of a scientific literature framework. Our secondary objective is to validate this platform and the curation methodology for COVID-19–related medical literature by expanding the COVID-19 Open Research Dataset via the addition of new, unstructured data. Methods: To create an infrastructure that addresses our objectives, the PanSurg Collaborative at Imperial College London has developed a unique data pipeline based on a web crawler extraction methodology. This data pipeline uses a novel curation methodology that adopts a human-in-the-loop approach for the characterization of quality, relevance, and key evidence across a range of scientific literature sources. Results: REDASA (Realtime Data Synthesis and Analysis) is now one of the world’s largest and most up-to-date sources of COVID-19–related evidence; it consists of 104,000 documents. By capturing curators’ critical appraisal methodologies through the discrete labeling and rating of information, REDASA rapidly developed a foundational, pooled, data science data set of over 1400 articles in under 2 weeks. These articles provide COVID-19–re
AU  - Vaghela,U
AU  - Rabinowicz,S
AU  - Bratsos,P
AU  - Martin,G
AU  - Fritzilas,E
AU  - Markar,S
AU  - Purkayastha,S
AU  - Stringer,K
AU  - Singh,H
AU  - Llewellyn,C
AU  - Dutta,D
AU  - Clarke,JM
AU  - Howard,M
AU  - Serban,O
AU  - Kinross,J
DO  - 10.2196/preprints.25714
EP  - 14
PY  - 2021///
SN  - 1438-8871
SP  - 1
TI  - Using a secure, continually updating, web source processing pipeline to support the real-time data synthesis and analysis of scientific literature: development and validation study
T2  - Journal of Medical Internet Research
UR  - http://dx.doi.org/10.2196/preprints.25714
UR  - http://doi.org/10.2196/preprints.25714
UR  - http://hdl.handle.net/10044/1/89343
VL  - 23
ER  -