Imperial College London

Professor David Hand

Faculty of Natural Sciences, Department of Mathematics

Senior Research Investigator
 
 
 

Contact

 

+44 (0)20 7594 2843, d.j.hand

 
 

Assistant

 

Mrs Louise Rowland +44 (0)20 7594 2843

 

Location

 

547, Huxley Building, South Kensington Campus


Publications

282 results found

Hand DJ, 2004, Strength in diversity: the advance of data analysis, Berlin, Knowledge discovery in databases: PKDD 2004: 8th European conference on principles and practice of knowledge discovery in databases, Pisa, Italy, 20 - 24 September 2004, Publisher: Springer-Verlag, Pages: 18-26, ISSN: 0302-9743

The scientific analysis of data is only around a century old. For most of that century, data analysis was the realm of only one discipline - statistics. As a consequence of the development of the computer, things have changed dramatically and now there are several such disciplines, including machine learning, pattern recognition, and data mining. This paper looks at some of the similarities and some of the differences between these disciplines, noting where they intersect and, perhaps of more interest, where they do not. Particular issues examined include the nature of the data with which they are concerned, the role of mathematics, differences in the objectives, how the different areas of application have led to different aims, and how the different disciplines have led sometimes to the same analytic tools being developed, but also sometimes to different tools being developed. Some conjectures about likely future developments are given.

CONFERENCE PAPER

Hand DJ, 2004, Academic obsessions and classification realities: ignoring practicalities in supervised classification, Berlin, Meeting of the International Federation of Classification Societies (IFCS), Illinois Institute of Technology, Chicago, IL, Publisher: Springer-Verlag, Pages: 209-232, ISSN: 1431-8814

Supervised classification methods have been the focus of a vast amount of research in recent decades, within a variety of intellectual disciplines, including statistics, machine learning, pattern recognition, and data mining. Highly sophisticated methods have been developed, using the full power of recent advances in computation. Many of these methods would have been simply inconceivable to earlier generations. However, most of these advances have taken place within the context of the classical supervised classification paradigm of data analysis. That is, a classification rule is constructed based on a given 'design sample' of data, with known and well-defined classes, and this rule is then used to classify future objects. This paper argues that this paradigm is often, perhaps typically, an over-idealisation of the practical realities of supervised classification problems. Furthermore, it is also argued that the sequential nature of the statistical modelling process means that the large gains in predictive accuracy are achieved early in the modelling process. Putting these two facts together leads to the suspicion that the apparent superiority of the highly sophisticated methods is often illusory: simple methods are often equally effective or even superior in classifying new data points.

CONFERENCE PAPER

Hand, David, 2004, Pattern discovery (Preface), Journal of Applied Statistics, Vol: 31, Pages: 883-884, ISSN: 0266-4763

JOURNAL ARTICLE

McDonald RA, Eckley IA, Hand DJ, 2004, A classifier combination tree algorithm, Berlin, 10th international workshop on structural and syntactic pattern recognition / 5th international conference on statistical techniques in pattern recognition, Lisbon, Portugal, Publisher: Springer-Verlag Berlin, Pages: 609-617, ISSN: 0302-9743

In recent years a number of authors have suggested that combining classifiers within local regions of the measurement space might yield superior classification performance to rigid global weighting schemes. In this paper we describe a modified version of the CART algorithm, called ARPACC, that performs local classifier combination. One obstacle to such combination is the fact that the 'optimal' covariance combination results originally assumed only two classes and classifier unbiasedness. In this paper we adopt an approach based on minimizing the Brier score and introduce a generalized matrix inverse solution for use in cases where the error matrix is singular. We also report some preliminary experimental results on simulated data.

CONFERENCE PAPER

McDonald RA, Hand DJ, Eckley IA, 2004, A multiclass extension to the Brownboost algorithm, INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, Vol: 18, Pages: 905-931, ISSN: 0218-0014

JOURNAL ARTICLE

Bolton RJ, Hand DJ, Webb AR, 2003, Projection techniques for nonlinear principal component analysis, STATISTICS AND COMPUTING, Vol: 13, Pages: 267-276, ISSN: 0960-3174

JOURNAL ARTICLE

Hand DJ, 2003, Statistics and the Theory of Measurement

Just as there are different interpretations of probability, leading to different kinds of inferential statements and different conclusions about statistical models and questions, so there are different theories of measurement, which in turn may lead to different kinds of statistical model and possibly different conclusions. This has led to much confusion and a long running debate about when different classes of statistical methods may legitimately be applied. This paper outlines the major theories of measurement and their relationships and describes the different kinds of models and hypotheses which may be formulated within each theory. One general conclusion is that the domains of applicability of the two major theories are typically different, and it is this which helps apparent contradictions to be avoided in most practical applications.

SCHOLARLY EDITION

Hand DJ, Henley WE, 2003, Statistical Classification Methods in Consumer Credit Scoring: A Review

Credit scoring is the term used to describe formal statistical methods used for classifying applicants for credit into "good" and "bad" risk classes. Such methods have become increasingly important with the dramatic growth in consumer credit in recent years. A wide range of statistical methods has been applied, though the literature available to the public is limited for reasons of commercial confidentiality. Particular problems arising in the credit scoring context are examined and the statistical methods which have been applied are reviewed.

SCHOLARLY EDITION

Hand DJ, Vinciotti V, 2003, Choosing k for two-class nearest neighbour classifiers with unbalanced classes, PATTERN RECOGNITION LETTERS, Vol: 24, Pages: 1555-1562, ISSN: 0167-8655

JOURNAL ARTICLE

Hand DJ, Vinciotti V, 2003, Local versus global models for classification problems: Fitting models where it matters, AMERICAN STATISTICIAN, Vol: 57, Pages: 124-131, ISSN: 0003-1305

JOURNAL ARTICLE

Hand DJ, 2003, Pattern discovery in data mining, Roma, Analisi statistica multivariata per le scienze economico-sociali, le scienze naturali e la tecnologia, Publisher: Società Italiana di Statistica, Pages: 15-26

CONFERENCE PAPER

Hand DJ, 2003, Choosing the right 'optimal' model in supervised classification, Literacia e Estatística, Actas do X Congresso Anual da Sociedade Portuguesa de Estatística, Porto, 25 - 28 September 2002, Pages: 31-41

CONFERENCE PAPER

Hand DJ, 2003, Individual freedom and the choice of umbrella, Kingston, Ontario, Statistics, science and public policy VII : environment, health and globalization : proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, U.K., 17 - 20 April 2002, Publisher: Queen's University, Pages: 213-218

CONFERENCE PAPER

Hand DJ, 2003, Selling mackerel by the pound, Kingston, Ontario, Statistics, science and public policy VII : environment, health and globalization : proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, U.K., 17 - 20 April 2002, Publisher: Queen's University, Pages: 67-72

CONFERENCE PAPER

King MD, Crowder MJ, Hand DJ, Harris NG, Williams SR, Obrenovitch TP, Gadian DG, 2003, Temporal relation between the ADC and DC potential responses to transient focal ischemia in the rat: A Markov chain Monte Carlo simulation analysis, JOURNAL OF CEREBRAL BLOOD FLOW AND METABOLISM, Vol: 23, Pages: 677-688, ISSN: 0271-678X

Markov chain Monte Carlo simulation was used in a reanalysis of the longitudinal data obtained by Harris et al. (J Cereb Blood Flow Metab 20:28-36) in a study of the direct current (DC) potential and apparent diffusion coefficient (ADC) responses to focal ischemia. The main purpose was to provide a formal analysis of the temporal relationship between the ADC and DC responses, to explore the possible involvement of a common latent (driving) process. A Bayesian nonlinear hierarchical random coefficients model was adopted. DC and ADC transition parameter posterior probability distributions were generated using three parallel Markov chains created using the Metropolis algorithm. Particular attention was paid to the within-subject differences between the DC and ADC time course characteristics. The results show that the DC response is biphasic, whereas the ADC exhibits monophasic behavior, and that the two DC components are each distinguishable from the ADC response in their time dependencies. The DC and ADC changes are not, therefore, driven by a common latent process. This work demonstrates a general analytical approach to the multivariate, longitudinal data-processing problem that commonly arises in stroke and other biomedical research.

JOURNAL ARTICLE

King M, Crowder MJ, Hand DJ, Harris NG, Williams SR, Obrenovitch TP, Gadian DG et al., 2003, Is there an ADC threshold for depolarisation? An MCMC analysis (Available on CD-ROM), Berkeley, CA, 11th annual meeting of the International Society for Magnetic Resonance in Medicine, Toronto, ON, Canada, 10 - 16 July 2003, Publisher: International Society for Magnetic Resonance in Medicine, Pages: 1944-1944

CONFERENCE PAPER

McDonald RA, Hand DJ, Eckley IA, 2003, An empirical comparison of three boosting algorithms on real data sets with artificial class noise, Berlin, Multiple classifier systems, 4th international workshop, MCS 2003, Guildford, UK, 11 - 13 June 2003: proceedings, Publisher: Springer, Pages: 35-44, ISSN: 0302-9743

Boosting algorithms are a means of building a strong ensemble classifier by aggregating a sequence of weak hypotheses. In this paper we consider three of the best-known boosting algorithms: Adaboost [9], Logitboost [11] and Brownboost [8]. These algorithms are adaptive, and work by maintaining a set of example and class weights which focus the attention of a base learner on the examples that are hardest to classify. We conduct an empirical study to compare the performance of these algorithms, measured in terms of overall test error rate, on five real data sets. The tests consist of a series of cross-validatory samples. At each validation, we set aside one third of the data chosen at random as a test set, and fit the boosting algorithm to the remaining two thirds, using binary stumps as a base learner. At each stage we record the final training and test error rates, and report the average errors within a 95% confidence interval. We then add artificial class noise to our data sets by randomly reassigning 20% of class labels, and repeat our experiment. We find that Brownboost and Logitboost prove less likely than Adaboost to overfit in this circumstance.
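
The data-handling steps described in this abstract (a random one-third holdout and 20% artificial class noise) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `holdout_split` and `add_class_noise` are hypothetical, the boosting step itself is omitted, and "reassigning" a label is assumed here to mean replacing it with a different class chosen uniformly at random, which the abstract does not specify.

```python
import random


def holdout_split(n_items, test_fraction, rng):
    """Return (train_indices, test_indices) for one random holdout split."""
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_test = round(n_items * test_fraction)
    return indices[n_test:], indices[:n_test]


def add_class_noise(labels, noise_rate, rng):
    """Return a copy of labels with a fraction moved to a different class.

    Assumes at least two distinct classes are present in labels.
    """
    classes = sorted(set(labels))
    noisy = list(labels)
    n_flip = round(len(labels) * noise_rate)
    # Sample distinct positions so exactly n_flip labels are changed.
    for i in rng.sample(range(len(labels)), n_flip):
        alternatives = [c for c in classes if c != noisy[i]]
        noisy[i] = rng.choice(alternatives)
    return noisy


rng = random.Random(0)  # fixed seed so the sketch is repeatable
labels = [0] * 15 + [1] * 15  # toy two-class labels, illustrative only
train_idx, test_idx = holdout_split(len(labels), 1 / 3, rng)
noisy_labels = add_class_noise(labels, 0.20, rng)
n_changed = sum(a != b for a, b in zip(labels, noisy_labels))
print(len(train_idx), len(test_idx), n_changed)
```

In a full experiment, the split would be repeated many times, a boosted classifier fitted on each training portion, and the test error rates averaged, first on the clean labels and then on the noisy ones.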

CONFERENCE PAPER

Till RJ, Hand DJ, 2003, Behavioural models of credit card usage, JOURNAL OF APPLIED STATISTICS, Vol: 30, Pages: 1201-1220, ISSN: 0266-4763

JOURNAL ARTICLE

Yearling D, Hand DJ, 2003, A Bayesian network datamining approach for modelling the physical condition of copper access networks, BT TECHNOLOGY JOURNAL, Vol: 21, Pages: 90-100, ISSN: 1358-3948

JOURNAL ARTICLE

Benton TC, Hand DJ, 2002, Segmentation into predictable classes, IMA Journal of Management Mathematics, Vol: 13, Pages: 245-260, ISSN: 1471-678X

JOURNAL ARTICLE

Bolton RJ, Hand DJ, 2002, Statistical fraud detection: A review, STATISTICAL SCIENCE, Vol: 17, Pages: 235-249, ISSN: 0883-4237

JOURNAL ARTICLE

Bolton RJ, Hand DJ, Adams NM, 2002, Determining hit rate in pattern search, New York, Pattern detection and discovery, ESF exploratory workshop, London, UK, 16 - 19 September, 2002, Publisher: Springer, Pages: 36-48

CONFERENCE PAPER

Denison DGT, Adams NM, Holmes CC, Hand DJ, 2002, Bayesian partition modelling, Meeting on Nonlinear Methods and Data Mining (NMDM2000), Publisher: ELSEVIER SCIENCE BV, Pages: 475-485, ISSN: 0167-9473

This paper reviews recent ideas in Bayesian classification modelling via partitioning. These methods provide predictive estimates for class assignments using averages of a sample of models generated from the posterior distribution of the model parameters. We discuss modifications to the basic approach more suitable for problems when there are many predictor variables and/or a large training sample.

CONFERENCE PAPER

Fayers PM, Hand DJ, 2002, Causal variables, indicator variables and measurement scales: an example from quality of life, JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, Vol: 165, Pages: 233-253, ISSN: 0964-1998

JOURNAL ARTICLE

Fayers PM, Hand DJ, 2002, Causal variables, indicator variables and measurement scales: an example from quality of life (with discussion), Journal of the Royal Statistical Society Series A - Statistics in Society, Vol: 165, Pages: 233-262, ISSN: 0964-1998

JOURNAL ARTICLE

Hand DJ, 2002, Pattern detection and discovery, New York, Pattern detection and discovery: ESF Exploratory Workshop, London, UK, 16 - 19 September 2002, Publisher: Springer, Pages: 1-12

CONFERENCE PAPER

Hand DJ, 2002, Discussion and conclusions, Kingston, Ontario, Statistics, science and public policy VI : science and responsibility : proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, U.K., 18 - 21 April 2001, Publisher: Queen's University, Pages: 267-271

CONFERENCE PAPER

Hand DJ, 2002, A discussion of cultures: two or three, Kingston, Ontario, Statistics, science and public policy VI : science and responsibility : proceedings of the conference on statistics, science and public policy held at Herstmonceux Castle, Hailsham, U.K., 18 - 21 April 2001, Publisher: Queen's University, Pages: 81-83

CONFERENCE PAPER

Hand DJ, 2002, Artificial intelligence, Encyclopedia of environmetrics, Editors: el-Shaarawi, Piegorsch, Chichester, Publisher: Wiley, Pages: 1-6, ISBN: 9780471899976

BOOK CHAPTER

Hand DJ, 2002, Artificial neural networks, Encyclopedia of environmetrics, Editors: el-Shaarawi, Piegorsch, Chichester, Publisher: Wiley, Pages: 1-7, ISBN: 9780471899976

BOOK CHAPTER

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
