Allin P, Hand DJ, From a system of national accounts to a process of national wellbeing accounting, International Statistical Review, ISSN: 1751-5823
There are repeated calls to go “Beyond GDP”, for measures of wellbeing and progress in addition to those that the System of National Accounts (SNA) is designed to provide. We identify key issues that can help build on the rigour of SNA whilst fitting the measurement of economic performance within a broader assessment of national wellbeing and progress. Such drivers are already leading to a proliferation of indicators and accounts, for example in the development of non-monetary measures of natural resources. There are significant measurement challenges, not least the question of whether a single, overall measure or index of wellbeing is valid. But the challenge of measurement, per se, is one thing: in our view, a more critical issue is whether the measures will actually be used. We propose a dynamic and multi-staged approach for developing SNA, embracing the production and use of measures. This would start by identifying user requirements for wider measures, to provide the basis for national and cross-national developments in wellbeing accounting. We envisage greater branding and marketing of national wellbeing concepts to promote measures and support their use. We call for outreach by producers, so that there is dialogue about the development and use of measures.
Hand DJ, Big data and data sharing, Journal of the Royal Statistical Society Series A - Statistics in Society, ISSN: 0964-1998
Hand DJ, Measurement: A Very Short Introduction - Rejoinder to discussion, Measurement: Interdisciplinary Research and Perspectives
Allin P, Hand DJ, 2017, New statistics for old?-measuring the wellbeing of the UK, JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, Vol: 180, Pages: 3-24, ISSN: 0964-1998
Attempts to create measures of national wellbeing and progress have a long history. In the UK, they go back at least as far as the 1790s, with Sir John Sinclair’s Statistical Account of Scotland. More recently, worldwide interest has led to the creation of a number of indices seeking to go beyond familiar economic measures like GDP. We review the Measuring National Well-being development programme of the UK’s Office for National Statistics, and explore some of the challenges which need to be faced to bring wider measures into use. These include: the importance of getting the measures adopted as policy drivers; how to challenge the continuing dominance of economic measures; sustainability and environmental issues; international comparability; and methodological statistical questions.
Hand D, Christen P, 2017, A note on using the F-measure for evaluating record linkage algorithms, Statistics and Computing, ISSN: 0960-3174
Record linkage is the process of identifying and linking records about the same entities from one or more databases. Record linkage can be viewed as a classification problem where the aim is to decide whether a pair of records is a match (i.e. two records refer to the same real-world entity) or a non-match (two records refer to two different entities). Various classification techniques—including supervised, unsupervised, semi-supervised and active learning based—have been employed for record linkage. If ground truth data in the form of known true matches and non-matches are available, the quality of classified links can be evaluated. Due to the generally high class imbalance in record linkage problems, standard accuracy or misclassification rate are not meaningful for assessing the quality of a set of linked records. Instead, precision and recall, as commonly used in information retrieval and machine learning, are used. These are often combined into the popular F-measure, which is the harmonic mean of precision and recall. We show that the F-measure can also be expressed as a weighted sum of precision and recall, with weights which depend on the linkage method being used. This reformulation reveals that the F-measure has a major conceptual weakness: the relative importance assigned to precision and recall should be an aspect of the problem and the researcher or user, but not of the particular linkage method being used. We suggest alternative measures which do not suffer from this fundamental flaw.
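The weighted-sum reading of the F-measure described in this abstract can be checked numerically. The sketch below is an illustration rather than the authors' code: the function names and the counts are made up, and it assumes at least one true positive. It relies on the identity F = w·R + (1 − w)·P with w = (TP + FN) / (2·TP + FP + FN), which follows from F = 2·TP / (2·TP + FP + FN).

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    return tp / (tp + fp), tp / (tp + fn)


def f_harmonic(tp, fp, fn):
    """F-measure as the harmonic mean of precision and recall."""
    p, r = precision_recall(tp, fp, fn)
    return 2 * p * r / (p + r)


def f_weighted(tp, fp, fn):
    """F-measure as a weighted sum w*recall + (1 - w)*precision; also returns w."""
    p, r = precision_recall(tp, fp, fn)
    w = (tp + fn) / (2 * tp + fp + fn)  # weight on recall, determined by the counts
    return w * r + (1 - w) * p, w


# Two hypothetical linkage methods evaluated on the same 100 true matches.
method_a = dict(tp=80, fp=20, fn=20)
method_b = dict(tp=60, fp=5, fn=40)

for name, counts in (("A", method_a), ("B", method_b)):
    f_w, w = f_weighted(**counts)
    print(name, "harmonic:", f_harmonic(**counts), "weighted:", f_w, "recall weight:", w)
```

Both forms give the same F value for each method, but the weight placed on recall differs between the two methods. That is the weakness the abstract points to: the weighting is set by each method's output rather than by the problem.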
Hand DJ, 2016, The case against a paradigm shift in the way we use data, FST Journal, Vol: 21, Pages: 10-12, ISSN: 1475-1704
Hand DJ, 2016, Measurement: a very short introduction, Publisher: Oxford University Press, ISBN: 9780198779568
Measurement underpins all of modern society, from science, through medicine, to management, economics, and government. This book describes the history of measurement, and presents a unified theory of measurement, covering all its aspects from measuring mass and length to measuring pain, depression, GDP, and beyond.
Hand DJ, 2016, Editorial: 'Big data' and data sharing, JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES A-STATISTICS IN SOCIETY, Vol: 179, Pages: 629-631, ISSN: 0964-1998
Hand DJ, 2015, From evidence to understanding: a commentary on Fisher (1922) 'On the mathematical foundations of theoretical statistics', PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, Vol: 373, Pages: 20140252-20140252, ISSN: 1364-503X
The nature of statistics has changed over time. It was originally concerned with descriptive ‘matters of state’: with summarizing population numbers, economic strength and social conditions. But during the course of the twentieth century its aim broadened to include inference: how to use data to shed light on underlying mechanisms, about what might happen in the future, about what would happen if certain actions were taken. Central to this development was Ronald Fisher. Over the course of his life he was responsible for many of the major conceptual advances in statistics. This is particularly illustrated by his 1922 paper, in which he introduced many of the concepts which remain fundamental to our understanding of how to extract meaning from data, right to the present day. It is no exaggeration to say that Fisher’s work, as illustrated by the ideas he described and developed in this paper, underlies all modern science, and much more besides. This commentary was written to celebrate the 350th anniversary of the journal Philosophical Transactions of the Royal Society.
Allin P, Hand DJ, 2014, The Wellbeing of Nations Meaning, Motive and Measurement, Publisher: John Wiley & Sons, ISBN: 9781118489574
Slowly we are learning to better count what really matters in our lives. This book explains the international collaboration behind this new learning and moves it far forward.
Hand D, 2014, The Improbability Principle Why coincidences, miracles and rare events happen all the time, Publisher: Random House, ISBN: 9781448170661
Here, in this highly original book - aimed squarely at anyone with an interest in coincidences, probability or gambling - eminent statistician David Hand answers this question by weaving together various strands of probability into a ...
Hand DJ, Anagnostopoulos C, 2014, A better Beta for the H measure of classification performance, PATTERN RECOGNITION LETTERS, Vol: 40, Pages: 41-46, ISSN: 0167-8655
Hand DJ, 2014, Wonderful Examples, but Let's not Close Our Eyes, STATISTICAL SCIENCE, Vol: 29, Pages: 98-100, ISSN: 0883-4237
Hand DJ, 2013, Data, Not Dogma: Big Data, Open Data, and the Opportunities Ahead, 12th International Symposium on Intelligent Data Analysis (IDA), Publisher: SPRINGER-VERLAG BERLIN, Pages: 1-12, ISSN: 0302-9743
Hand DJ, Anagnostopoulos C, 2013, When is the area under the receiver operating characteristic curve an appropriate measure of classifier performance?, PATTERN RECOGNITION LETTERS, Vol: 34, Pages: 492-495, ISSN: 0167-8655
Hand DJ, 2013, Latent Variable Models and Factor Analysis: A Unified Approach, 3rd Edition, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 333-334, ISSN: 0306-7734
Hand DJ, 2013, A Practitioner's Guide to Resampling for Data Analysis, Data Mining, and Modeling, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 326-326, ISSN: 0306-7734
Hand DJ, 2013, Living Standards Analytics: Development through the Lens of Household Survey Data, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 331-332, ISSN: 0306-7734
Hand DJ, 2013, Comparing Groups: Randomization and Bootstrap Methods Using R, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 326-328, ISSN: 0306-7734
Hand DJ, 2013, A Statistical Guide for the Ethically Perplexed, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 314-316, ISSN: 0306-7734
Hand DJ, 2013, Introduction to Probability with Texas Hold'em Examples, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 334-334, ISSN: 0306-7734
Hand DJ, 2013, Statistical Evaluation of Diagnostic Performance: Topics in ROC Analysis, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 335-335, ISSN: 0306-7734
Hand DJ, 2013, Graphical Models with R, INTERNATIONAL STATISTICAL REVIEW, Vol: 81, Pages: 316-316, ISSN: 0306-7734
Hand DJ, 2013, From Evidence to Understanding: A Precarious Path, EUROPEAN REVIEW, Vol: 21, Pages: S32-S39, ISSN: 1062-7987
Falsifiability is the cornerstone of science. However, Rutherford notwithstanding, almost by definition science functions at the limits of measurement accuracy and theoretical grasp, so that statistical analysis is central to scientific advance. This applies as much to physics as it does to psychology, as much to geology as to biology. I look at some of the potholes in the path of scientific discovery, showing how easy it is to stumble, and at some of the consequences for the scientific endeavour.
Henrion M, Hand DJ, Gandy A, et al., 2013, CASOS: a Subspace Method for Anomaly Detection in High Dimensional Astronomical Databases, STATISTICAL ANALYSIS AND DATA MINING, Vol: 6, Pages: 53-72, ISSN: 1932-1864
We develop a novel algorithm for detecting anomalies. Our method has been developed to suit the challenging task of detecting anomalous sources in cross-matched astronomical survey data. Our algorithm computes anomaly scores in lower-dimensional subspaces of the data. By subspaces we mean, in this work, subsets of the original data variables. Our technique presents several advantages over existing methods: it can work directly on data with missing values; it addresses some of the problems posed by high-dimensional data spaces; it is less susceptible to a masking effect from irrelevant features; it can be easily adapted to suit specific needs and it allows an easier interpretation of why a given object has a high combined anomaly score. One drawback of our method is that it cannot detect outliers that are only apparent in high-dimensional spaces. Anomaly scores are computed using a nearest neighbor (NN) technique, but the algorithm works with any other method computing numerical anomaly scores. We demonstrate the properties of our algorithm and evaluate its performance on both simulated and real datasets. We show that it is capable of outperforming state-of-the-art, full-dimensional approaches in some situations. © 2012 Wiley Periodicals, Inc., A Wiley Company.
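As a rough illustration of the general idea in this abstract (nearest-neighbour anomaly scores computed on subsets of the variables, then combined), here is a minimal sketch. It is not the CASOS algorithm. The function names, the use of all two-variable subsets, the max-combination rule, and the toy data are assumptions made for the example; the abstract does not specify them.

```python
from itertools import combinations
import math


def nn_score(points, i):
    """Distance from points[i] to its nearest other point (larger means more anomalous)."""
    return min(math.dist(points[i], q) for j, q in enumerate(points) if j != i)


def subspace_scores(data, dim=2):
    """Combine NN scores over all `dim`-variable subsets by taking the maximum.

    A row with a missing value (None) in a subset is skipped for that subset only,
    so incomplete rows are still scored wherever they are observed.
    """
    n_vars = len(data[0])
    combined = [0.0] * len(data)
    for subset in combinations(range(n_vars), dim):
        # Keep only rows that are fully observed on this subset of variables.
        rows = [(idx, tuple(row[v] for v in subset))
                for idx, row in enumerate(data)
                if all(row[v] is not None for v in subset)]
        if len(rows) < 2:
            continue
        pts = [p for _, p in rows]
        for k, (idx, _) in enumerate(rows):
            combined[idx] = max(combined[idx], nn_score(pts, k))
    return combined


# Toy data (invented): the last row is far from the rest in the first two variables;
# the third row has a missing third variable.
data = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.1, None],
    [0.1, 0.1, 0.0],
    [5.0, 5.0, 0.05],
]
scores = subspace_scores(data)
print(scores)
```

Because each subset is scored separately, the subset that produced the largest score also indicates which variables make an object look unusual, which matches the interpretability point made in the abstract.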
Bentham J, Hand DJ, 2012, Data mining from a patient safety database: the lessons learned, DATA MINING AND KNOWLEDGE DISCOVERY, Vol: 24, Pages: 195-217, ISSN: 1384-5810
The issue of patient safety is an extremely important one; each year in the UK, hundreds of thousands of people suffer due to some sort of incident that occurs whilst they are in National Health Service care. The National Patient Safety Agency (NPSA) works to try to reduce the scale of the problem. One of its major projects is to collect a very large dataset, the Reporting and Learning System (RLS), which describes several million of these incidents. The RLS is used as the basis for research by the NPSA. However, the NPSA has identified a gap in their work between high-level quantitative analysis and detailed, manual analysis of small samples. This paper describes the lessons learned from a knowledge discovery process that attempted to fill this gap. The RLS contains a free text description of each incident. A high dimensional model of the text is calculated, using the vector space model with term weighting applied. Dimensionality reduction techniques are used to produce the final models of the text. These models are examined using an anomaly detection tool to find groups of incidents that should be coherent in meaning, and that might be of interest to the NPSA. A three stage process is developed for assessing the results. The first stage uses a quantitative measure based on the use of planted groups of known interest, the second stage involves manual filtering by a non-expert, and the third stage is assessment by clinical experts. © 2011 The Author(s).
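The vector space model with term weighting mentioned in this abstract can be sketched as follows. This is a generic illustration, not the paper's pipeline: TF-IDF is assumed as the weighting scheme (the abstract does not name one), the whitespace tokenizer is deliberately naive, and the incident texts are invented.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one TF-IDF weighted vector per document."""
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenised for t in toks})
    n_docs = len(docs)
    # Document frequency: the number of documents containing each term.
    df = Counter(t for toks in tokenised for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)  # raw term counts; absent terms count as 0
        vectors.append([tf[t] * idf[t] for t in vocab])
    return vocab, vectors


# Invented free-text incident descriptions, for illustration only.
docs = [
    "patient fell from bed",
    "patient given wrong dose",
    "patient fell in corridor",
]
vocab, vectors = tfidf_vectors(docs)
for doc, vec in zip(docs, vectors):
    print(doc, [round(x, 3) for x in vec])
```

A term that appears in every document, such as "patient" here, receives zero weight, while rarer terms dominate each vector. The resulting vectors have one dimension per vocabulary term, which is why a dimensionality reduction step follows in the process the abstract describes.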
Hand DJ, 2012, Credit scoring, insurance and discrimination, Statistics, Science and Public Policy XVI, Editors: Herzberg, Pages: 85-90, ISBN: 9781553393825
Hand DJ, Crowder MJ, 2012, Overcoming selectivity bias in evaluating new fraud detection systems for revolving credit operations, INTERNATIONAL JOURNAL OF FORECASTING, Vol: 28, Pages: 216-223, ISSN: 0169-2070
Hand DJ, 2012, Statistical Concepts: A Second Course, 4th edition, INTERNATIONAL STATISTICAL REVIEW, Vol: 80, Pages: 491-491, ISSN: 0306-7734
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.