Imperial College London

Dr Martine Nurek

Faculty of Medicine, Department of Surgery & Cancer

Honorary Research Associate



+44 (0)20 7594 3062
m.nurek




504, Medical School, St Mary's Campus






11 results found

Nurek M, Kostopoulou O, 2023, How the UK public views the use of diagnostic decision aids by physicians: a vignette-based experiment, Journal of the American Medical Informatics Association, Vol: 30, Pages: 888-898, ISSN: 1067-5027

Objective: Physicians’ low adoption of diagnostic decision aids (DDAs) may be partially due to concerns about patient/public perceptions. We investigated how the UK public views DDA use and factors affecting perceptions. Materials and Methods: In this online experiment, 730 UK adults were asked to imagine attending a medical appointment where the doctor used a computerized DDA. The DDA recommended a test to rule out serious disease. We varied the test’s invasiveness, the doctor’s adherence to DDA advice, and the severity of the patient’s disease. Before disease severity was revealed, respondents indicated how worried they felt. Both before [t1] and after [t2] severity was revealed, we measured satisfaction with the consultation, likelihood of recommending the doctor, and suggested frequency of DDA use. Results: At both timepoints, satisfaction and likelihood of recommending the doctor increased when the doctor adhered to DDA advice (P ≤ .01), and when the DDA suggested an invasive versus noninvasive test (P ≤ .05). The effect of adherence to DDA advice was stronger when participants were worried (P ≤ .05), and the disease turned out to be serious (P ≤ .01). Most respondents felt that DDAs should be used by doctors “sparingly” (34%[t1]/29%[t2]), “frequently” (43%[t1]/43%[t2]), or “always” (17%[t1]/21%[t2]). Discussion: People are more satisfied when doctors adhere to DDA advice, especially when worried, and when it helps to spot serious disease. Having to undergo an invasive test does not appear to dampen satisfaction. Conclusion: Positive attitudes regarding DDA use and satisfaction with doctors adhering to DDA advice could encourage greater use of DDAs in consultations.

Journal article

Nurek M, Hay AD, Kostopoulou O, 2023, Online experiment comparing GPs’ antibiotic prescribing decisions to a clinical prediction rule, British Journal of General Practice, Vol: 73, Pages: e176-e185, ISSN: 0960-1643

Background: The “STARWAVe” clinical prediction rule (CPR) uses seven factors to guide risk assessment and antibiotic prescribing in children with cough (Short illness duration, Temperature, Age, Recession, Wheeze, Asthma, Vomiting). Aim: To assess the influence of STARWAVe factors on General Practitioners’ (GPs) unaided risk assessments and prescribing decisions. We also explored two methods of obtaining risk assessments and tested the impact of parental concern. Design and setting: Experiment comprising clinical vignettes administered to 188 UK GPs online. Method: GPs were randomly assigned to view 32 (of 64) vignettes depicting children with cough. Vignettes varied the STARWAVe factors systematically. Per vignette, GPs assessed risk of deterioration in one of two ways (sliding scale vs. risk category selection) and indicated whether they would prescribe antibiotics. Finally, they saw an additional vignette, suggesting that the parent was concerned. Using mixed-effects regressions, we measured the influence of STARWAVe factors, risk elicitation method, and parental concern on GPs’ assessments and decisions. Results: Six STARWAVe risk factors correctly increased GPs’ risk assessments (bs_sliding-scale ≥ 0.66, ORs_category-selection ≥ 1.61, ps ≤ 0.001) while one incorrectly reduced them (short duration: b_sliding-scale = -0.31, OR_category-selection = 0.75, ps ≤ 0.039). Conversely, one STARWAVe factor increased prescribing odds (fever: OR = 5.22, p < 0.001) while the rest either reduced them (short duration, age, recession: ORs ≤ 0.70, ps < 0.001) or had no significant impact (wheeze, asthma, vomiting: ps ≥ 0.065). Parental concern increased risk assessments (b_sliding-scale = 1.29, OR_category-selection = 2.82, ps ≤ 0.003) but not prescribing (p = 0.378). Conclusion: GPs use some, but not all, STARWAVe factors when making unaided risk assessments and prescribing decisions. Such discrepancies must be considered when introducing CPRs to clinical practice.

Journal article

Kourtidis P, Nurek M, Delaney B, Kostopoulou O et al., 2022, Influences of early diagnostic suggestions on clinical reasoning, Cognitive Research: Principles and Implications, Vol: 7, ISSN: 2365-7464

Previous research has highlighted the importance of physicians’ early hypotheses for their subsequent diagnostic decisions. It has also been shown that diagnostic accuracy improves when physicians are presented with a list of diagnostic suggestions to consider at the start of the clinical encounter. The psychological mechanisms underlying this improvement in accuracy are hypothesised. It is possible that the provision of diagnostic suggestions disrupts physicians’ intuitive thinking and reduces their certainty in their initial diagnostic hypotheses. This may encourage them to seek more information before reaching a diagnostic conclusion, evaluate this information more objectively, and be more open to changing their initial hypotheses. Three online experiments explored the effects of early diagnostic suggestions, provided by a hypothetical decision aid, on different aspects of the diagnostic reasoning process. Family physicians assessed up to two patient scenarios with and without suggestions. We measured effects on certainty about the initial diagnosis, information search and evaluation, and frequency of diagnostic changes. We did not find a clear and consistent effect of suggestions and detected mainly non-significant trends, some in the expected direction. We also detected a potential biasing effect: when the most likely diagnosis was included in the list of suggestions (vs. not included), physicians who gave that diagnosis initially tended to request less information, evaluate it as more supportive of their diagnosis, become more certain about it, and change it less frequently when encountering new but ambiguous information; in other words, they seemed to validate rather than question their initial hypothesis. We conclude that further research using different methodologies and more realistic experimental situations is required to uncover both the beneficial and biasing effects of early diagnostic suggestions.

Journal article

Delaney BC, Rayner C, Freyer A, Taylor S, Jaerte L, MacDermott N, Nurek M et al., 2022, Recommendations for the recognition, diagnosis, and management of long COVID response, British Journal of General Practice, Vol: 72, Pages: 259-260, ISSN: 0960-1643

Journal article

Nurek M, Rayner C, Freyer A, Taylor S, Jaerte L, MacDermott N, Delaney BC et al., 2021, Recommendations for the recognition, diagnosis, and management of long COVID: a Delphi study, British Journal of General Practice, Vol: 71, Pages: E815-E825, ISSN: 0960-1643

Background: In the absence of research into therapies and care pathways for long COVID, guidance based on ‘emerging experience’ is needed. Aim: To provide a rapid expert guide for GPs and long COVID clinical services. Design and setting: A Delphi study was conducted with a panel of primary and secondary care doctors. Method: Recommendations were generated relating to the investigation and management of long COVID. These were distributed online to a panel of UK doctors (any specialty) with an interest in, lived experience of, and/or experience treating long COVID. Over two rounds of Delphi testing, panellists indicated their agreement with each recommendation (using a five-point Likert scale) and provided comments. Recommendations eliciting a response of ‘strongly agree’, ‘agree’, or ‘neither agree nor disagree’ from 90% or more of responders were taken as showing consensus. Results: Thirty-three clinicians representing 14 specialties reached consensus on 35 recommendations. Chiefly, GPs should consider long COVID in the presence of a wide range of presenting features (not limited to fatigue and breathlessness) and exclude differential diagnoses where appropriate. Detailed history and examination with baseline investigations should be conducted in primary care. Indications for further investigation and specific therapies (for myocarditis, postural tachycardia syndrome, mast cell disorder) include hypoxia/desaturation, chest pain, palpitations, and histamine-related symptoms. Rehabilitation should be individualised, with careful activity pacing (to avoid relapse) and multidisciplinary support. Conclusion: Long COVID clinics should operate as part of an integrated care system, with GPs playing a key role in the multidisciplinary team. Holistic care pathways, investigation of specific complications, management of potential symptom clusters, and tailored rehabilitation are needed.

Journal article
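
The consensus rule described in the Delphi abstract above can be illustrated with a minimal sketch. The function name, the response strings, and the example panel counts are hypothetical and chosen only for illustration; the study's actual analysis code is not given in the source.

```python
# Responses that the abstract counts towards consensus.
CONSENSUS_RESPONSES = {"strongly agree", "agree", "neither agree nor disagree"}


def shows_consensus(responses, threshold=0.90):
    """Return True if at least `threshold` of responders gave a qualifying response.

    `responses` is a list of Likert labels for one recommendation
    (hypothetical input format).
    """
    if not responses:
        return False
    qualifying = sum(1 for r in responses if r.strip().lower() in CONSENSUS_RESPONSES)
    return qualifying / len(responses) >= threshold


# Hypothetical panel of 33 responders: 30 qualifying responses, 3 disagreeing.
example = ["agree"] * 30 + ["disagree"] * 3
print(shows_consensus(example))
```

With 30 of 33 qualifying responses (about 91%) the rule is met; with 29 of 33 (about 88%) it is not.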

Kostopoulou O, Nurek M, Delaney B, 2020, Disentangling the relationship between physician and organizational performance: a signal detection approach, Medical Decision Making, Vol: 40, Pages: 746-755, ISSN: 0272-989X

Background. In previous research, we employed a signal detection approach to measure the performance of general practitioners (GPs) when deciding about urgent referral for suspected lung cancer. We also explored associations between provider and organizational performance. We found that GPs from practices with higher referral positive predictive value (PPV; chance of referrals identifying cancer) were more reluctant to refer than those from practices with lower PPV. Here, we test the generalizability of our findings to a different cancer. Methods. A total of 252 GPs responded to 48 vignettes describing patients with possible colorectal cancer. For each vignette, respondents decided whether urgent referral to a specialist was needed. They then completed the 8-item Stress from Uncertainty scale. We measured GPs’ discrimination (d′) and response bias (criterion; c) and their associations with organizational performance and GP demographics. We also measured correlations of d′ and c between the 2 studies for the 165 GPs who participated in both. Results. As in the lung study, organizational PPV was associated with response bias: in practices with higher PPV, GPs had higher criterion (b = 0.05 [0.03 to 0.07]; P < 0.001), that is, they were less inclined to refer. As in the lung study, female GPs were more inclined to refer than males (b = −0.17 [−0.30 to −0.105]; P = 0.005). In a mediation model, stress from uncertainty did not explain the gender difference. Only response bias correlated between the 2 studies (r = 0.39, P < 0.001). Conclusions. This study confirms our previous findings regarding the relationship between provider and organizational performance and strengthens the finding of gender differences in referral decision making. It also provides evidence that response bias is a relatively stable feature of GP referral decision making.

Journal article

Nurek M, Delaney BC, Kostopoulou O, 2020, Risk assessment and antibiotic prescribing decisions in children presenting to UK primary care with cough: a vignette study, BMJ Open, Vol: 10, ISSN: 2044-6055

Objectives: The validated “STARWAVe” clinical prediction rule (CPR) uses seven variables to guide risk assessment and antimicrobial stewardship in children presenting with cough (Short illness duration, Temperature, Age, Recession, Wheeze, Asthma, Vomiting). We aimed to compare General Practitioners’ (GPs) risk assessments and prescribing decisions to those of STARWAVe, and assess the influence of the CPR’s clinical variables. Setting: Primary care. Participants: 252 GPs, currently practising in the UK. Design: GPs were randomly assigned to view four (of a possible eight) clinical vignettes online. Each vignette depicted a child presenting with cough, who was described in terms of the seven STARWAVe variables. Systematically, we manipulated patient age (20 months vs. 5 years), illness duration (3 vs. 6 days), vomiting (present vs. absent) and wheeze (present vs. absent), holding the remaining STARWAVe variables constant. Outcome measures: Per vignette, GPs assessed risk of hospitalisation and indicated whether they would prescribe antibiotics or not. Results: GPs overestimated risk of hospitalisation in 9% of vignette presentations (88/1008) and underestimated it in 46% (459/1008). Despite underestimating risk, they overprescribed: 78% of prescriptions were unnecessary relative to GPs’ own risk assessments (121/156), while 83% were unnecessary relative to STARWAVe risk assessments (130/156). All four of the manipulated variables influenced risk assessments, but only three influenced prescribing decisions: a shorter illness duration reduced prescribing odds (OR 0.14, 95% CI 0.08-0.27, p<0.001), while vomiting and wheeze increased them (OR_vomit 2.17, 95% CI 1.32-3.57, p=0.002; OR_wheeze 8.98, 95% CI 4.99-16.15, p<0.001). Conclusions: Relative to STARWAVe, GPs underestimated risk of hospitalisation, overprescribed, and appeared to

Journal article

Kostopoulou O, Nurek M, Cantarella S, Okoli G, Fiorentino F, Delaney B et al., 2019, Referral decision making of General Practitioners: a signal detection study, Medical Decision Making, Vol: 39, Pages: 21-31, ISSN: 0272-989X

Background. Signal detection theory (SDT) describes how respondents categorize ambiguous stimuli over repeated trials. It measures separately “discrimination” (ability to recognize a signal amid noise) and “criterion” (inclination to respond “signal” v. “noise”). This is important because respondents may produce the same accuracy rate for different reasons. We employed SDT to measure the referral decision making of general practitioners (GPs) in cases of possible lung cancer. Methods. We constructed 44 vignettes of patients for whom lung cancer could be considered and estimated their 1-year risk. Under UK risk-based guidelines, half of the vignettes required urgent referral. We recruited 216 GPs from practices across England. Practices differed in the positive predictive value (PPV) of their urgent referrals (chance of referrals identifying cancer) and the sensitivity (chance of cancer patients being picked up via urgent referral from their practice). Participants saw the vignettes online and indicated whether they would refer each patient urgently or not. We calculated each GP’s discrimination (d′) and criterion (c) and regressed these on practice PPV and sensitivity, as well as on GP experience and gender. Results. Criterion was associated with practice PPV: as PPV increased, GPs’ c also increased, indicating lower inclination to refer (b = 0.06 [0.02–0.09]; P = 0.001). Female GPs were more inclined to refer than male GPs (b = −0.20 [−0.40 to −0.001]; P = 0.049). Average discrimination was modest (d′ = 0.77), highly variable (range, −0.28 to 1.91), and not associated with practice referral performance. Conclusions. High referral PPV at the organizational level indicates GPs’ inclination to avoid false positives, not better discrimination. Rather than bluntly mandating increases in practice PPV via more referrals, it is necessary to increase discrimina

Journal article
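
The discrimination (d′) and criterion (c) measures named in the abstract above can be sketched with the standard equal-variance signal detection formulas. This is an illustrative sketch only: the function name, the log-linear correction for extreme rates, and the example counts are assumptions, not details taken from the study.

```python
from statistics import NormalDist


def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one respondent.

    hits / misses: vignettes requiring referral that were / were not referred.
    false_alarms / correct_rejections: vignettes not requiring referral
    that were / were not referred.
    """
    # Assumption: log-linear correction (add 0.5 per cell) keeps rates away
    # from 0 and 1, where the inverse normal CDF is undefined.
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)              # discrimination
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))   # response bias
    return d_prime, criterion


# Hypothetical respondent on 44 vignettes (22 requiring referral, 22 not):
# refers 10 of the 22 that require it and 2 of the 22 that do not.
d_prime, criterion = sdt_measures(hits=10, misses=12,
                                  false_alarms=2, correct_rejections=20)
print(round(d_prime, 2), round(criterion, 2))
```

Under this convention a positive criterion indicates a lower inclination to respond "refer", which matches the direction of c described in the abstract.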

Nurek M, Kostopoulou O, 2016, What You Find Depends on How You Measure It: Reactivity of Response Scales Measuring Predecisional Information Distortion in Medical Diagnosis, PLOS One, Vol: 11, ISSN: 1932-6203

“Predecisional information distortion” occurs when decision makers evaluate new information in a way that is biased towards their leading option. The phenomenon is well established, as is the method typically used to measure it, termed “stepwise evolution of preference” (SEP). An inadequacy of this method has recently come to the fore: it measures distortion as the total advantage afforded a leading option over its competitor, and therefore it cannot differentiate between distortion to strengthen a leading option (“proleader” distortion) and distortion to weaken a trailing option (“antitrailer” distortion). To address this, recent research introduced new response scales to SEP. We explore whether and how these new response scales might influence the very proleader and antitrailer processes that they were designed to capture (“reactivity”). We used the SEP method with concurrent verbal reporting: fifty family physicians verbalized their thoughts as they evaluated patient symptoms and signs (“cues”) in relation to two competing diagnostic hypotheses. Twenty-five physicians evaluated each cue using the response scale traditional to SEP (a single response scale, returning a single measure of distortion); the other twenty-five did so using the response scales introduced in recent studies (two separate response scales, returning two separate measures of distortion: proleader and antitrailer). We measured proleader and antitrailer processes in verbalizations, and compared verbalizations in the single-scale and separate-scales groups. Response scales did not appear to affect proleader processes: the two groups of physicians were equally likely to bolster their leading diagnosis verbally. Response scales did, however, appear to affect antitrailer processes: the two groups denigrated their trailing diagnosis verbally to differing degrees. Our findings suggest that the response scales used to measure infor

Journal article

Nurek M, Kostopoulou O, Delaney BC, Esmail A et al., 2015, Reducing diagnostic errors in primary care. A systematic meta-review of computerized diagnostic decision support systems by the LINNEAUS collaboration on patient safety in primary care, European Journal of General Practice, Vol: 21, Pages: 8-13, ISSN: 1751-1402

BACKGROUND: Computerized diagnostic decision support systems (CDDSS) have the potential to support the cognitive task of diagnosis, which is one of the areas where general practitioners have greatest difficulty and which accounts for a significant proportion of adverse events recorded in the primary care setting. OBJECTIVE: To determine the extent to which CDDSS may meet the requirements of supporting the cognitive task of diagnosis, and the currently perceived barriers that prevent the integration of CDDSS with electronic health record (EHR) systems. METHODS: We conducted a meta-review of existing systematic reviews published in English, searching MEDLINE, Embase, PsycINFO and Web of Knowledge for articles on the features and effectiveness of CDDSS for medical diagnosis published since 2004. Eligibility criteria included systematic reviews where individual clinicians were primary end users. Outcomes we were interested in were the effectiveness and identification of specific features of CDDSS on diagnostic performance. RESULTS: We identified 1970 studies and excluded 1938 because they did not fit our inclusion criteria. A total of 45 articles were identified and 12 were found suitable for meta-review. Extraction of high-level requirements identified that a more standardized computable approach is needed to knowledge representation, one that can be readily updated as new knowledge is gained. In addition, a deep integration with the EHR is needed in order to trigger at appropriate points in cognitive workflow. CONCLUSION: Developing a CDDSS that is able to utilize dynamic vocabulary tools to quickly capture and code relevant diagnostic findings, and coupling these with individualized diagnostic suggestions based on the best-available evidence has the potential to improve diagnostic accuracy, but requires evaluation.

Journal article

Nurek M, Kostopoulou O, Hagmayer Y, 2014, Predecisional information distortion in physicians’ diagnostic judgments: Strengthening a leading hypothesis or weakening its competitor?, Judgment and Decision Making, Vol: 9, Pages: 572-585, ISSN: 1930-2975

Decision makers have been found to bias their interpretation of incoming information to support an emerging judgment (predecisional information distortion). This is a robust finding in human judgment, and was recently also established and measured in physicians’ diagnostic judgments (Kostopoulou et al. 2012). The two studies reported here extend this work by addressing the constituent modes of distortion in physicians. Specifically, we studied whether and to what extent physicians distort information to strengthen their leading diagnosis and/or to weaken a competing diagnosis. We used the “stepwise evolution of preference” method with three clinical scenarios, and measured distortion on separate rating scales, one for each of the two competing diagnoses per scenario. In Study 1, distortion in an experimental group was measured against the responses of a separate control group. In Study 2, distortion in a new experimental group was measured against participants’ own, personal responses provided under control conditions, with the two response conditions separated by a month. The two studies produced consistent results. On average, we found considerable distortion of information to weaken the trailing diagnosis but little distortion to strengthen the leading diagnosis. We also found individual differences in the tendency to engage in either mode of distortion. Given that two recent studies found both modes of distortion in lay preference (Blanchard, Carlson & Meloy, 2014; DeKay, Miller, Schley & Erford, 2014), we suggest that predecisional information distortion is affected by participant and task characteristics. Our findings contribute to the growing research on the different modes of predecisional distortion and their stability to methodological variation.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
