Imperial College London

EUR ING Dr Edward A Meinert

Faculty of Medicine, School of Public Health

Honorary Senior Lecturer
 
 
 

Contact

 

e.meinert14

 
 

Location

 

Reynolds Building, Charing Cross Campus



Citation

BibTeX format

@article{Surodina:2020:rs.3.rs-38387/v1,
author = {Surodina, S and Lam, C and Grbich, S and Milne-Ives, M and Velthoven, MV and Meinert, E},
doi = {10.21203/rs.3.rs-38387/v1},
title = {Requirements Engineering of a Herpes Simplex Virus Patient Registry: Alpha Phase},
url = {http://dx.doi.org/10.21203/rs.3.rs-38387/v1},
year = {2020}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Background: Collecting data from people with herpes simplex virus is challenging because of poor data quality, low user engagement, and concerns around stigma and anonymity. This project aimed to improve data collection for a real-world HSV registry by identifying predictors of HSV infection and selecting a limited number of relevant questions to ask new registry users in order to determine the HSV infection risk group. Methods: The US National Health and Nutrition Examination Survey (NHANES, 2015-16) database has confirmed HSV1 and HSV2 status of American participants (14-49 years) as well as a wealth of demographic and health-related data. Two datasets – for HSV1 and HSV2 – were formed using this database, and an anonymous lifestyle-data based questionnaire with a Random Forest algorithm was devised using Python. The algorithm was optimised to reduce the number of questions and to identify risk groups for HSV. Data was split into subsets to train and test the model. Results: The model selected a reduced number of questions from the NHANES questionnaire that predicted HSV infection risk with high accuracy scores of 0.91 and 0.96 and high recall scores of 0.88 and 0.98 for HSV1 and HSV2 datasets, respectively. The number of questions was reduced from 150 to an average of 40, depending on age and gender, that together provide high predictability of the infection. Conclusions: This machine-learning algorithm for risk identification of people infected with HSV can be used in a real-world evidence registry to collect relevant lifestyle data. A current limitation is the absence of real user data and integration with electronic medical records that would enable model learning and improvement. Future work will explore model adjustments, anonymisation options, e
AU  - Surodina, S
AU  - Lam, C
AU  - Grbich, S
AU  - Milne-Ives, M
AU  - Velthoven, MV
AU  - Meinert, E
DO  - 10.21203/rs.3.rs-38387/v1
PY  - 2020///
TI  - Requirements Engineering of a Herpes Simplex Virus Patient Registry: Alpha Phase
UR  - http://dx.doi.org/10.21203/rs.3.rs-38387/v1
ER  -