Imperial College London

Professor David Hand

Faculty of Natural Sciences, Department of Mathematics

Senior Research Investigator
 
 
 

Contact

 

+44 (0)20 7594 2843, d.j.hand

 
 

Assistant

 

Mrs Louise Rowland +44 (0)20 7594 2843

 

Location

 

547 Huxley Building, South Kensington Campus



 

Publications


282 results found

De Veaux RD, Hand DJ, 2005, How to lie with bad data, Statistical Science, Vol: 20, Pages: 231-238, ISSN: 0883-4237

JOURNAL ARTICLE

Hand DJ, Crowder MJ, 2005, Measuring customer quality in retail banking, Statistical Modelling, Vol: 5, Pages: 145-158, ISSN: 1471-082X

JOURNAL ARTICLE

Hand DJ, 2005, Supervised classification and tunnel vision, Applied Stochastic Models in Business and Industry, Vol: 21, Pages: 97-109, ISSN: 1524-1904

JOURNAL ARTICLE

Hand DJ, 2005, Good practice in retail credit scorecard assessment, Credit Rating and Scoring Models Conference, Publisher: Palgrave Macmillan Ltd, Pages: 1109-1117, ISSN: 0160-5682

In retail banking, predictive statistical models called 'scorecards' are used to assign customers to classes, and hence to appropriate actions or interventions. Such assignments are made on the basis of whether a customer's predicted score is above or below a given threshold. The predictive power of such scorecards gradually deteriorates over time, so that performance needs to be monitored. Common performance measures used in the retail banking sector include the Gini coefficient, the Kolmogorov-Smirnov statistic, the mean difference, and the information value. However, all of these measures use irrelevant information about the magnitude of scores, and fail to use crucial information relating to numbers misclassified. The result is that such measures can sometimes be seriously misleading, resulting in poor quality decisions being made, and mistaken actions being taken. The weaknesses of these measures are illustrated. Performance measures not subject to these risks are defined, and simple numerical illustrations are given.

CONFERENCE PAPER

Hand DJ, Heard NA, 2005, Finding groups in gene expression data, Journal of Biomedicine and Biotechnology, Vol: 2005, Pages: 215-225, ISSN: 1110-7243

The vast potential of the genomic insight offered by microarray technologies has led to their widespread use since they were introduced a decade ago. Application areas include gene function discovery, disease diagnosis, and inferring regulatory networks. Microarray experiments enable large-scale, high-throughput investigations of gene activity and have thus provided the data analyst with a distinctive, high-dimensional field of study. Many questions in this field relate to finding subgroups of data profiles which are very similar. A popular type of exploratory tool for finding subgroups is cluster analysis, and many different flavors of algorithms have been used and indeed tailored for microarray data. Cluster analysis, however, implies a partitioning of the entire data set, and this does not always match the objective. Sometimes pattern discovery or bump hunting tools are more appropriate. This paper reviews these various tools for finding interesting subgroups.

JOURNAL ARTICLE

Hand DJ, Sohn SY, Kim Y, 2005, Optimal bipartite scorecards, Expert Systems with Applications, Vol: 29, Pages: 684-690, ISSN: 0957-4174

JOURNAL ARTICLE

Hand D, 2005, Data analysis in personal financial services: a rich opportunity, Significance, Vol: 2, Pages: 110-113, ISSN: 1740-9705

JOURNAL ARTICLE

Hand DJ, 2005, Grade inflation, Kingston, Ontario, Statistics, science and public policy IX. Government, science and politics. Proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, UK, 21 - 24 April 2004, Publisher: Queen's University, Pages: 115-124

CONFERENCE PAPER

Hand DJ, 2005, Modern data analysis tools in personal financial services: a quantitative revolution?, Data mining et apprentissage statistique applications en assurance, Niort, France, 12 - 13 May 2005, Pages: 1-8

CONFERENCE PAPER

Hand DJ, 2005, Data mining, Encyclopedia of statistics in behavioral science, Editors: Everitt, Howell, Publisher: Wiley, Pages: 461-465, ISBN: 9780470860809

BOOK CHAPTER

Hand DJ, 2005, Pattern recognition, Handbook of statistics, Editors: Rao, Wegman, Amsterdam, Publisher: Elsevier, Pages: 213-228, ISBN: 9780444511416

BOOK CHAPTER

Hand DJ, Adams NM, Heard NA, 2005, Pattern discovery tools for detecting cheating in student coursework, Berlin, Local pattern detection. International seminar. Dagstuhl Castle, Germany, 12 - 16 April 2004, Publisher: Springer-Verlag, Pages: 39-52, ISSN: 0302-9743

Students sometimes cheat. In particular, they sometimes copy coursework assignments from each other. Such copying is occasionally detected by the markers, since the copied script and the original will be unusually similar. However, one cannot rely on such subjective assessment - perhaps there are many scripts or perhaps the student has sought to disguise the copying by changing words or other aspects of the answers. We describe an attempt to develop a pattern discovery method for detecting cheating, based on measures of the similarities between scripts, where similarity is defined in syntactic rather than semantic terms. This problem differs from many other pattern discovery problems because the peaks will typically be very low: normally only one or two cheating students will copy from any given other student.

CONFERENCE PAPER

Hand DJ, 2005, Size matters - how measurement defines our world, Significance, Vol: 2, Pages: 81-83, ISSN: 1740-9705

JOURNAL ARTICLE

Hand DJ, Krzanowski WJ, 2005, Optimising k-means clustering results with standard software packages, Computational Statistics & Data Analysis, Vol: 49, Pages: 969-973, ISSN: 0167-9473

JOURNAL ARTICLE

Heard NA, Holmes CC, Stephens DA, Hand DJ, Dimopoulos G, 2005, Bayesian coclustering of Anopheles gene expression time series: Study of immune defense response to multiple experimental challenges, Proceedings of the National Academy of Sciences of the United States of America, Vol: 102, Pages: 16939-16944, ISSN: 0027-8424

We present a method for Bayesian model-based hierarchical coclustering of gene expression data and use it to study the temporal transcription responses of an Anopheles gambiae cell line upon challenge with multiple microbial elicitors. The method fits statistical regression models to the gene expression time series for each experiment and performs coclustering on the genes by optimizing a joint probability model, characterizing gene coregulation between multiple experiments. We compute the model using a two-stage Expectation-Maximization-type algorithm, first fixing the cross-experiment covariance structure and using efficient Bayesian hierarchical clustering to obtain a locally optimal clustering of the gene expression profiles and then, conditional on that clustering, carrying out Bayesian inference on the cross-experiment covariance using Markov chain Monte Carlo simulation to obtain an expectation. For the problem of model choice, we use a cross-validatory approach to decide between individual experiment modeling and varying levels of coclustering. Our method successfully generates tightly coregulated clusters of genes that are implicated in related processes and therefore can be used for analysis of global transcript responses to various stimuli and prediction of gene functions.

JOURNAL ARTICLE

Jamain A, Hand DJ, 2005, The Naive Bayes Mystery: a classification detective story, Pattern Recognition Letters, Vol: 26, Pages: 1752-1760, ISSN: 0167-8655

JOURNAL ARTICLE

King MD, Crowder MJ, Hand DJ, Harris NG, Williams SR, Obrenovitch TP, Gadian DG, 2005, Is anoxic depolarisation associated with an ADC threshold? A Markov chain Monte Carlo analysis, NMR in Biomedicine, Vol: 18, Pages: 587-594, ISSN: 0952-3480

A Bayesian nonlinear hierarchical random coefficients model was used in a reanalysis of a previously published longitudinal study of the extracellular direct current (DC)-potential and apparent diffusion coefficient (ADC) responses to focal ischaemia. The main purpose was to examine the data for evidence of an ADC threshold for anoxic depolarisation. A Markov chain Monte Carlo simulation approach was adopted. The Metropolis algorithm was used to generate three parallel Markov chains and thus obtain a sampled posterior probability distribution for each of the DC-potential and ADC model parameters, together with a number of derived parameters. The latter were used in a subsequent threshold analysis. The analysis provided no evidence indicating a consistent and reproducible ADC threshold for anoxic depolarisation.

JOURNAL ARTICLE

Thomas LC, Oliver RW, Hand DJ, 2005, A survey of the issues in consumer credit modelling research, Journal of the Operational Research Society, Vol: 56, Pages: 1006-1015, ISSN: 0160-5682

JOURNAL ARTICLE

Zhang ZC, Hand DJ, 2005, Detecting groups of anomalously similar objects in large data sets, Berlin, Advances in Intelligent Data Analysis; IDA 2005, 6th International symposium; Madrid, Publisher: Springer, Pages: 509-519, ISSN: 0302-9743

Pattern discovery is a facet of data mining concerned with the detection of "small local" structures in large data sets. In high dimensions this is typically difficult because of the computational work involved in searching over the data space. In this paper we outline a tool called PEAKER which can detect patterns efficiently in high dimensions. We approach the subject through the two aspects of pattern discovery, detection and verification. We demonstrate various ways of using PEAKER as well as its various inherent properties, emphasizing the exploratory nature of the tool.

CONFERENCE PAPER

Adams NM, Crowder MJ, Hand DJ, Stephens DA, 2004, Methods and models in statistics: in honour of Professor John Nelder, FRS, London, Publisher: Imperial College Press, ISBN: 9781860944635

BOOK

Benton T, Hand D, Crowder M, 2004, Two zs are better than one, Journal of Applied Statistics, Vol: 31, Pages: 239-247, ISSN: 0266-4763

JOURNAL ARTICLE

Bolton RJ, Hand DJ, Crowder M, 2004, Significance tests for unsupervised pattern discovery in large continuous multivariate data sets, Computational Statistics & Data Analysis, Vol: 46, Pages: 57-79, ISSN: 0167-9473

JOURNAL ARTICLE

Hand DJ, 2004, Deconstructing Statistical Questions

Too much current statistical work takes a superficial view of the client's research question, adopting techniques which have a solid history, a sound mathematical basis or readily available software, but without considering in depth whether the questions being answered are in fact those which should be asked. Examples, some familiar and others less so, are given to illustrate this assertion. It is clear that establishing the mapping from the client's domain to a statistical question is one of the most difficult parts of a statistical analysis. It is a part in which the responsibility is shared by both client and statistician. A plea is made for more research effort to go in this direction and some suggestions are made for ways to tackle the problem.

SCHOLARLY EDITION

Hand DJ, Bolton RJ, 2004, Pattern discovery and detection: A unified statistical methodology, Journal of Applied Statistics, Vol: 31, Pages: 885-924, ISSN: 0266-4763

JOURNAL ARTICLE

Hand DJ, Glasbey C, Husmeier D, Gower JC, van Houwelingen HC, Bugrien JB, Nason G, Critchley F, Hoff PD, McLachlan GJ, Bean RW et al., 2004, Clustering objects on subsets of attributes - Discussion, Journal of the Royal Statistical Society Series B - Statistical Methodology, Vol: 66, Pages: 839-849, ISSN: 1369-7412

JOURNAL ARTICLE

Hand DJ, 2004, Strength in diversity: The advance of data analysis, Berlin, 15th European Conference on Machine Learning/8th European Conference on Principles and Practice of Knowledge Discovery in Databases, Publisher: Springer-Verlag Berlin, Pages: 18-26, ISSN: 0302-9743

The scientific analysis of data is only around a century old. For most of that century, data analysis was the realm of only one discipline - statistics. As a consequence of the development of the computer, things have changed dramatically and now there are several such disciplines, including machine learning, pattern recognition, and data mining. This paper looks at some of the similarities and some of the differences between these disciplines, noting where they intersect and, perhaps of more interest, where they do not. Particular issues examined include the nature of the data with which they are concerned, the role of mathematics, differences in the objectives, how the different areas of application have led to different aims, and how the different disciplines have led sometimes to the same analytic tools being developed, but also sometimes to different tools being developed. Some conjectures about likely future developments are given.

CONFERENCE PAPER

Hand DJ, 2004, Propose vote of thanks for 'Clustering objects on subsets of attributes' by J.H.Friedman and J.J.Meulman (Review), Journal of the Royal Statistical Society Series B - Statistical Methodology, Vol: 66, Pages: 839-840, ISSN: 1369-7412

JOURNAL ARTICLE

Hand DJ, 2004, Credit scoring, Encyclopedia of actuarial science, Editors: Teugels, Sundt, Chichester, Publisher: Wiley, Pages: 2-14, ISBN: 9780470846766

BOOK CHAPTER

Hand DJ, 2004, Crime, statistics, and behaviour, Kingston, Ontario, Statistics, science and public policy VIII : science, ethics and the law: proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, U.K., 23 - 26 April 2003, Publisher: Queen's University, Pages: 181-187

CONFERENCE PAPER

Hand DJ, 2004, Measurement theory and practice : the world through quantification, London, Publisher: Arnold, ISBN: 9780340677834

BOOK

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
