Statistics seminar
The seminars are currently held on Fridays at the Department of Mathematics, Huxley Building room 130 (180 Queen's Gate, South Kensington, number 13 on this map).
The seminars are organized by Nikolas Kantas.
Announcements concerning the seminar are sent via a mailing list (to which you can subscribe here).
Upcoming Statistics seminars
Archive: Statistics Seminars Pre-2011 Archive
Seminars 2010-2011
Friday 17th June, Huxley 340
14:30  Anthony Lee (University of Oxford)
Bayesian Sparsity Path Analysis using Hierarchical Shrinkage Priors
Variable selection techniques have become increasingly popular amongst statisticians due to an increased number of regression and classification applications involving high-dimensional data where we expect some predictors to be unimportant. In this context, shrinkage priors and maximum a posteriori (MAP) estimates are often used to identify important variables. A hierarchical framework for specifying shrinkage priors is introduced and we motivate the use of these priors in exploratory data analysis using both Bayesian inference and MAP estimation. Generating posterior distributions over a range of prior specifications is computationally challenging but naturally amenable to sequential Monte Carlo (SMC) algorithms indexed on the scale parameter of the prior. We show how SMC simulation on graphics processing units (GPUs) provides the computational power required for timely inference. An example from case-control genome-wide association studies is presented.
16:00  Prof Stephen Walker (University of Kent)
Sampling the mixture of Dirichlet process model with a power likelihood
If a likelihood is raised to a power in (0,1), then the sequence of Bayesian posterior distributions is known to be strongly consistent. However, for the popular mixture of Dirichlet process model it is not clear how to undertake posterior inference via MCMC with the likelihood raised to a power in (0,1). Taking a power just less than one, for example, would guarantee a consistent sequence of posteriors which should be extremely similar to those obtained with the standard likelihood. The talk will show how to do posterior inference for the mixture of Dirichlet process model with the likelihood raised to a power in (0,1).
Friday, 10 December 2010
14:30  Prof Sofia Olhede (University College London)
Statistical Methods for Analysis of Diffusion Weighted Magnetic Resonance Imaging
High angular resolution diffusion imaging data is the observed characteristic function for the local diffusion of water molecules in tissue. This data is used to infer structural information in brain imaging. Nonparametric scalar measures are proposed to summarize such data, and to locally characterize spatial features of the diffusion probability density function (PDF), relying on the geometry of the characteristic function. Summary statistics are defined so that their distributions are, to first order, both independent of nuisance parameters and analytically tractable. The dominant direction of the diffusion at a spatial location (voxel) is determined, and a new set of axes is introduced in Fourier space. Variation quantified in these axes determines the local spatial properties of the diffusion density. Nonparametric hypothesis tests for determining whether the diffusion is unimodal, isotropic or multimodal are proposed. More subtle characteristics of white-matter microstructure, such as the degree of anisotropy of the PDF and symmetry compared with a variety of asymmetric PDF alternatives, may be ascertained directly in the Fourier domain without parametric assumptions on the form of the diffusion PDF. We simulate a set of diffusion processes and characterize their local properties using the newly introduced summaries. We show how complex white-matter structures across multiple voxels exhibit clear ellipsoidal and asymmetric structure in simulation, and assess the performance of the statistics in clinically acquired magnetic resonance imaging data. Joint work with Brandon Whitcher, GSK.
16:00  Dr Christoforos Anagnostopoulos (University of Cambridge)
Handling temporal variation of unknown characteristics in streaming data analysis
Data collection technology is undergoing a revolution that is enabling streaming acquisition of real-time information in a wide variety of settings. Faced with indefinitely long, high-frequency and possibly high-dimensional data sequences, learning algorithms must rely on summary statistics and computationally efficient online inference without the need to store and revisit the data history. Moreover, learning must be temporally adaptive in order to remain up-to-date against unforeseen changes, smooth or abrupt, in the underlying data generation mechanism. In cases where explicit dynamic modelling is either impossible or impractical, temporally adaptive behaviour may still be induced by controlling the responsiveness of the estimator to novel information. We discuss ways in which this can be accomplished in data-dependent manners for popular classes of online algorithms. We focus on the Robbins-Monro family of algorithms that naturally features a sequence of user-specified learning rates, and discuss available methodology for automatic self-tuning in this context. On the basis of novel theoretical insights, as well as real-data experiments, we demonstrate the advantages and shortcomings of such approaches in handling temporal variation of unknown characteristics.
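As a rough illustration of the role played by the learning rates mentioned in the abstract above (this is not the speaker's method), the following Python sketch tracks a mean with a Robbins-Monro recursion. The function name `robbins_monro_mean`, the synthetic stream and the two step-size schedules are assumptions made for this example: a decaying rate 1/n reproduces the running sample mean, while a constant rate discounts old data and so follows an abrupt change.

```python
def robbins_monro_mean(stream, step_size):
    """Track the mean of a data stream with the recursion
    theta_n = theta_{n-1} + gamma_n * (x_n - theta_{n-1})."""
    theta = 0.0
    for n, x in enumerate(stream, start=1):
        gamma = step_size(n)  # user-specified learning rate for step n
        theta += gamma * (x - theta)
    return theta


# Hypothetical stream whose level jumps from 0 to 10 halfway through.
data = [0.0] * 50 + [10.0] * 50

# Decaying rate 1/n: the recursion equals the sample mean of all data.
decaying = robbins_monro_mean(data, lambda n: 1.0 / n)

# Constant rate: old observations are forgotten geometrically,
# so the estimate ends close to the most recent level.
constant = robbins_monro_mean(data, lambda n: 0.2)
```

With the decaying schedule the estimate averages over both regimes, whereas the constant schedule ends near the most recent level. Choosing the rate automatically from the data is the self-tuning problem the abstract refers to.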
Friday, 3 December 2010
Transfer Talks
Paul Ginzberg
Processing of proper and improper quaternion signals
Complex signal processing uses the algebraic structure of complex numbers to account for relationships between the real and imaginary components of a signal and processes them jointly. Using the 4-dimensional algebraic structure of quaternions can provide similar insight when dealing with 4-component signals. Algebraic structure is directly related to patterned covariance matrices. We can test for this pattern, and its presence (propriety) or absence (impropriety) has implications for the choice of techniques used to process the signal.
Swati Chandna
Simulating Improper Complex-Valued Processes via Circulant Embedding
The technique of circulant embedding has been used for simulating realizations from certain real-valued Gaussian stationary processes. We show how this technique can be adapted to handle complex-valued processes to generate realizations from an improper complex-valued Gaussian stationary process with a priori prescribed second-order statistics. This technique has two potential advantages over other competing methods for simulating time series. First, this method is based on a discrete Fourier transform which makes it computationally attractive. Secondly, it generates exact realizations as opposed to approximate realizations. In practice, especially when dealing with applications in engineering and physical sciences, it is more likely to be provided with a set of time series data rather than a model with second-order statistics. In such situations it is useful to generate simulated time series whose statistical properties closely resemble the time series under study. We show how certain nonparametric spectral estimators from the given time series can be used together with the circulant embedding approach to generate the required realizations. We also provide results which show that this methodology provides a good nonparametric procedure for bootstrapping time series to assess the sampling variability in certain statistics of interest.
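For readers unfamiliar with the technique, the sketch below shows circulant embedding in the classical real-valued case only; it does not implement the complex-valued, improper extension described in the talk. The function names, the example autocovariance sequence and the use of a naive O(m^2) discrete Fourier transform (instead of an FFT) are assumptions made to keep the example dependency-free.

```python
import cmath
import math
import random


def circulant_eigenvalues(acov):
    """Eigenvalues of the circulant matrix that embeds the Toeplitz
    covariance matrix built from autocovariances acov[0..n-1]."""
    # First row of the circulant: c0, c1, ..., c_{n-1}, c_{n-2}, ..., c1
    row = list(acov) + list(acov[-2:0:-1])
    m = len(row)
    # Naive DFT of the first row; the result is real because row is symmetric.
    return [
        sum(row[j] * cmath.exp(-2j * math.pi * j * k / m) for j in range(m)).real
        for k in range(m)
    ]


def simulate_stationary(acov, rng):
    """Draw one realization of length len(acov) from a zero-mean real
    Gaussian stationary process with the given autocovariances."""
    n = len(acov)
    lam = circulant_eigenvalues(acov)
    if min(lam) < -1e-12:
        raise ValueError("circulant embedding is not nonnegative definite")
    m = len(lam)
    # Complex Gaussian weights scaled by the square-root eigenvalues.
    w = [
        math.sqrt(max(l, 0.0) / m) * complex(rng.gauss(0, 1), rng.gauss(0, 1))
        for l in lam
    ]
    # DFT of the weights; the real part of the first n terms has the
    # target covariance (the imaginary part is an independent copy).
    y = [
        sum(w[k] * cmath.exp(-2j * math.pi * j * k / m) for k in range(m))
        for j in range(n)
    ]
    return [v.real for v in y]


# Hypothetical AR(1)-type autocovariance sequence c_k = 0.5 ** k.
acov = [0.5 ** k for k in range(8)]
sample = simulate_stationary(acov, random.Random(0))
```

The realization is exact in distribution whenever all eigenvalues are nonnegative, which is why the sketch checks them before sampling.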
Matt Silver
Simultaneous pathway and SNP selection using the grouped lasso
I will present a method for the ranking of significant biological pathways associated with a quantitative phenotype, using group lasso penalized regression in which SNPs are grouped into functionally related gene sets. In addition, the method simultaneously ranks significant SNPs within selected pathways. An important distinguishing feature of the method is its ability to account for the presence of overlapping gene sets, arising from the typically large number of genes that are assigned to multiple pathways. The use of pointwise stability selection combined with sparse regression across multiple pathways makes the method highly computationally efficient when compared with other methods that use permutations to rank significance. Using simulated quantitative phenotypes generated using real genotype and pathway data, we find that our method performs well when compared with other widely used methods for pathway and SNP selection.
Elena Ehrlich
Adaptive Filtering and Time-varying Systems
For fully specified Linear Gaussian State-Space models, it is well known that the Kalman Filter (KF) provides optimal update equations. In many real problems the state-space model cannot be fully specified, and modifications to standard filtering methodology or alternative estimation methods are required to estimate the states and parameters simultaneously. The problem is significantly harder if model parameters are varying in time, particularly in unpredicted ways. Our objective is to explore the utility of Adaptive Filtering in such challenging problems. In particular, we develop a formal relationship between KF and Recursive Least Squares with Adaptive Forgetting (RLSAF), and describe how this relationship can be exploited to identify a time-varying model which is not fully specified.
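To make the idea of forgetting concrete, here is a minimal sketch of scalar recursive least squares with a fixed forgetting factor. It is a simplification chosen for illustration: the talk concerns adaptive forgetting, in which the factor is tuned online, and that is not implemented here. The function name `rls_forgetting`, the default parameters and the synthetic data are all assumptions.

```python
def rls_forgetting(xs, ys, lam=0.9, p0=1e6):
    """Scalar recursive least squares for y = theta * x + noise,
    with a fixed forgetting factor lam in (0, 1].
    Returns the sequence of estimates of theta."""
    theta, p = 0.0, p0  # initial estimate and (large) initial variance
    history = []
    for x, y in zip(xs, ys):
        gain = p * x / (lam + x * p * x)   # gain applied to the residual
        theta += gain * (y - x * theta)    # correct using the prediction error
        p = (p - gain * x * p) / lam       # inflate variance to forget old data
        history.append(theta)
    return history


# Hypothetical noiseless data whose slope switches from 2 to -1 at i = 100.
xs = [1.0 + 0.1 * (i % 5) for i in range(200)]
ys = [(2.0 if i < 100 else -1.0) * x for i, x in enumerate(xs)]

estimates = rls_forgetting(xs, ys)              # lam = 0.9 tracks the change
no_forgetting = rls_forgetting(xs, ys, lam=1.0) # ordinary RLS averages regimes
```

With `lam=1.0` the recursion reduces to ordinary recursive least squares and the final estimate sits between the two slopes, while `lam=0.9` recovers the new slope after the change.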
Friday, 19 November 2010
14:30  Dr Thomas Nichols (Warwick University)
Title: Point Process Modelling of Brain Image Data
Abstract: The standard approach to brain imaging data analysis is a mass-univariate one. At each voxel (volume element) a linear model is fit, ignoring all other voxels. While this has obvious computational advantages, it cannot capture the explicitly spatial nature of brain activations that are of interest. We propose a hierarchical point process approach, modelling brain activation as a mixture of latent activation centers, and these activation centers are in turn modelled as offspring from latent population centers. This approach allows inference on the location of population centers, separately estimating the uncertainty of the population location and the uncertainty of individual activation center's location about that population center (akin to the distinction between standard error and standard deviation). We also show how this approach naturally encompasses coordinate-valued data, such as generated by neuroimaging meta-analyses and Multiple Sclerosis lesion data.
16:00  Dr Enrico Petretto (Imperial College London)
Title: Integrated systems-genetics approaches: deciphering the biological function of genes and gene networks that drive disease
Abstract: Combined analyses of gene networks and DNA sequence variation can provide new insights into the aetiology of common diseases that may not be apparent from genome-wide association studies (GWASs) alone. Recent advances in rat genomics now make systems-genetics approaches possible, which we developed to identify and functionally characterise biological processes associated with disease. We used integrated genome-wide approaches across seven rat tissues to identify gene networks and the loci underlying their regulation. We defined an interferon regulatory factor 7 (IRF7)-driven inflammatory network (IDIN) enriched for viral response genes, which represents a molecular biomarker for macrophages and was regulated in multiple tissues by a locus on rat chromosome 15q25. We show that Epstein–Barr virus induced gene 2 (Ebi2, also known as Gpr183), which lies at this locus and controls B lymphocyte migration, is expressed in macrophages and regulates the IDIN. The human orthologous locus on chromosome 13q32 controlled the human equivalent of the IDIN, which was conserved in monocytes. IDIN genes were more likely to associate with susceptibility to type 1 diabetes (T1D), a macrophage-associated autoimmune disease, than randomly selected immune response genes (P = 8.85 × 10^−6). The human locus controlling the IDIN was associated with the risk of T1D at single nucleotide polymorphism rs9585056 (P = 7.0 × 10^−10; odds ratio 1.15), which was one of five single nucleotide polymorphisms in this region associated with EBI2 (GPR183) expression. These data implicate IRF7 network genes and their trans-acting regulatory locus in the pathogenesis of T1D and show that co-expression networks across species provide functional annotation of genes in biological processes that can be used to reveal the signal of common genetic variation of small effect that is not detected by GWAS.
Friday, 15 October 2010
14:30  Dr Jim Griffin (University of Kent)
Title: Structuring Shrinkage
Abstract: The effectiveness (and shortcomings) of penalized maximum likelihood methods, such as the Lasso, in regression has led to a renewed interest in the choice of prior distribution for regression coefficients in a Bayesian analysis. This talk will consider independent Normal-Gamma distributions as a prior and illustrate how they can effectively control model complexity. The prior acts as a starting point for defining priors with dependence between regression coefficients and this will also be discussed.
16:00  Dr Nikolas Kantas (Imperial College London)
Title: Parameter Inference for rare events using Particle Markov chain Monte Carlo
Abstract: In this talk we consider parameter inference associated with stochastic processes that are stopped at the first hitting time of some rare set. Our approach is to use a recently introduced simulation methodology, Particle Markov Chain Monte Carlo (PMCMC), which is an MCMC algorithm that uses proposals generated using Sequential Monte Carlo (SMC) methods. However, standard SMC algorithms are not always appropriate for stopped processes and we introduce new ideas based upon particle approximations of multilevel Feynman-Kac formulae. The methodology is applied to the coalescent and a queuing model. This is joint work with Ajay Jasra and Arnaud Doucet.
Seminars 2009-2010
Friday, 15 January 2010
14:30  Alberto Cozzini (transfer talk)
Title: Penalised Robust Mixture Modelling using t-Student Distributions and Applications
Abstract: It is standard practice to divide financial markets among macro sectors. The criteria for the classification are usually based on the nature and fundamental characteristics of the goods exchanged. In our work we take a data-driven approach and cluster the markets on the basis of several indicators of price dynamics extracted from historical time series of the returns. Standard clustering algorithms, such as model-based algorithms based on mixture distributions, assume that all the variables are informative for clustering and that each variable is normally distributed. However, we believe that certain financial indicators may not be informative for clustering and should not be used. Moreover, most indicators follow distributions with heavier tails compared to the normal distribution. In order to address these two issues, we propose a penalized model-based clustering algorithm based on a mixture of t-distributions. The clustering algorithm is able to differentiate between variables that are important for clustering and irrelevant variables, and is robust against outliers. Statistical inference is carried out using an EM algorithm. Several alternative penalty functions will be discussed and experimental results based on both simulated and real data sets will be presented.
15:00  Christofer Minas (transfer talk)
Title: Differential Analysis in Gene Expression Time Courses using Distance-Based Functional Data Analysis
Abstract: In many biological settings, gene expression data are collected over time from subjects or replicates of different groups, classified as such by attributes such as treatment. Typically, the responses comprise only a few observed values and many missing values, thus making it harder to identify or rank the genes in which the responses of the subjects differ the most between the groups. We approach this problem from a novel functional perspective, considering the responses as noisy realizations of some smooth function, creating gene-specific dissimilarity matrices from a functional principal component analysis, and using distance-based permutation testing to rank the genes in order of significance. In particular, we consider the pseudo F-test, the Multi-Response Permutation Procedure (MRPP) and the Mantel test. Equivalences between these methods are discussed, before comparing their performance with a multivariate empirical Bayes method via a simulation study. The distance-based methods are then applied to a real dataset where responses from 9 dendritic and 9 macrophage cells are observed for 22282 genes, and genes in which responses are significantly different are highlighted.
16:00  Todd Kuffner (transfer talk)
Title: Matching Objective Bayesian and Conditional Frequentist Coverage Probabilities in Ancillary Statistic Models.
Abstract: Probability matching priors are a standard tool used to achieve higher-order matching of coverage probabilities between Bayesian and unconditional frequentist confidence regions. We extend the analysis to ancillary statistic models, where the correct inference to be performed is a conditional one. Our results suggest that the method developed by Barndorff-Nielsen (1986) and extended by DiCiccio and Young (2009) is particularly useful in identifying priors which ensure higher-order matching. In some transformation models, the matching is exact in the sense that, if we choose a prior identified by our matching method, then the Bayesian confidence intervals have exactly the correct conditional frequentist interpretation.
16:30  Orlando Doehring (transfer talk)
Title: Curve Selection in Multi-Group Functional Principal Component Analysis
Abstract: We model the variability of multi-group time series within a functional data analysis framework. This type of problem may arise in human motion data, where each sample has three different groups of time series, one group for each spatial coordinate. Functional Principal Component Analysis (fPCA) is an unsupervised approach to data processing and dimensionality reduction where we have one group of functions or curves. The given discrete observations, which may have been recorded at different time points and may not be equally spaced, will be modelled functionally by projecting them onto a basis expansion. In multi-group fPCA each sample is paired with multiple groups of curves. For each sample those multiple groups are concatenated and uni-group fPCA is performed. The aim is to study the variability of each of these groups and to compute a projection into the direction of maximum variability. But in multi-group fPCA the projected coordinates will be based on all the original groups. To improve interpretation we develop a sparse projection that does not depend on all original components. To reduce the number of groups explicitly used in the projection, a new multi-group sparse PCA approach is proposed. Hence groups of curves are regularized in a manner related to ideas that were recently developed for the grouped Lasso.
Friday, 18 December 2009
13:30  16:00 Astrostatistics Seminars
There is a growing interest in the use of modern statistical methods to tackle astronomical problems, driven partly by the huge astronomical data sets which are being collected. Statisticians and astronomers at Imperial already have at least one joint project underway, and other independent discussions have also taken place. We therefore thought it timely to organise a meeting to gauge the level of interest and bring together relevant parties, with the aim of stimulating cross-disciplinary work and new collaborations.
Programme:
Statistical and data mining tools for astronomical problems
David J. Hand, Maths
Statistical challenges in cosmology and astroparticle physics
Roberto Trotta, Physics
Two examples of astrostatistics: star/galaxy separation and anomaly detection
Marc Henrion, Maths
Inferring the properties of astronomical objects: triumph and disaster
Daniel Mortlock, Physics
Friday, 11 December 2009
14:30  Dr Sumeetpal S. Singh (Dept Engineering, Univ Cambridge)
Title: Recursive smoothing using Particle Filters
Abstract: Sequential Monte Carlo (SMC) methods are a widely used set of computational tools for inference in nonlinear non-Gaussian state-space models. We propose a new SMC algorithm to perform smoothing recursively in time. Essentially, it is an online implementation of the forward filtering backward smoothing SMC algorithm proposed by Doucet, Godsill and Andrieu (2000). We show that the asymptotic variance of the path space SMC estimator increases quadratically in time whereas the increase is linear in time for our new SMC estimator. We then use the new SMC estimator to perform recursive parameter estimation using an SMC implementation of an online version of the Expectation-Maximization algorithm which does not suffer from the particle path degeneracy problem. This is joint work with Pierre Del Moral and Arnaud Doucet.
16:00  Prof Adrian Bowman (Department of Statistics, University of Glasgow)
Title: Modelling surface shape: from faces to brains
Abstract: Stereophotogrammetry provides high-resolution data defining the shape of three-dimensional objects. One example of its application is in facial surgery where images can be used to quantify the success of an operation and to quantify residual differences from control shape. Information can be extracted in a variety of forms. Methods of analysing landmark shape data are well developed but landmarks alone clearly do not adequately represent the very much richer information present in each digitised face. Facial curves with clear anatomical meaning can also be extracted. In order to exploit the full extent of the information present in the images, standardised meshes, whose nodes correspond across individuals, can also be fitted. Some of the issues involved in analysing data of these types will be discussed and illustrated on surgical data. The measurement of asymmetry and the construction of longitudinal models are of particular interest. A second form of surface data arises in the analysis of MEG data which is collected from the head surface of patients and gives information on underlying brain activity. In this case, spatiotemporal smoothing offers a route to a flexible model for the spatial and temporal locations of stimulated brain activity.
Friday, 6 November 2009
14:30  Prof. Robin Henderson (School of Mathematics and Statistics, University of Newcastle)
Title: Regret regression for optimal treatment allocation, with application to warfarin anticoagulation.
Abstract: TBA
16:00  Prof. Chris Holmes (Oxford Centre for Gene Function)
Title: Hierarchical Bayesian mixture models.
Abstract: We will discuss Bayesian approaches to clustering of data using mixture models. Clustering is often used to uncover hidden structure in data or to recover suspected structure. We show that by building hierarchical dependencies (via priors) on the mixing weights, mixture locations and mixture variances we can induce a rich set of generalisations of the original model which are well suited to features of many applications. The investigation is motivated by various studies in statistical genomics.
Friday, 16 October 2009
14:30  Dr Jenny Barrett (Section of Epidemiology and Biostatistics, Leeds Institute of Molecular Medicine)
Title: Applications of the Random Forest algorithm in proteomics and genetics
Abstract: The Random Forest (RF) algorithm was developed by Breiman as a classification tool and has been used successfully as a classifier in many contexts. We applied the RF algorithm to peaks extracted from mass spectrometry proteomic profiles as part of a "competition", designed to compare various approaches to classifying samples from breast cancer patients and controls. The algorithm performed well as a classifier and, unlike most other methods, provided good estimates of future performance from the training data set. An additional feature of the RF is a measure of the contribution of each variable to the success of the classification: variable importance measures. Using the same data set we have shown that these measures are quite stable over repeated runs of the algorithm and show good consistency among highly ranked variables between training and validation data sets. The measures have better properties when the RF is applied to a reduced data set, with low correlation between variables. The ranking of variables obtained from RF was compared with the results of simple univariate tests. We have further investigated the properties of variable importance measures in the context of genome-wide association (GWA) studies. In a simulation study of GWA data, we have investigated the power to identify loci that truly influence disease risk either singly or in combination, compared with standard approaches. Despite the good properties of RF as a classifier, in most contexts studied to date we have found that the RF does not outperform much simpler methods of identifying important variables.
16:00 Dr. Tim Ebbels (Biological Chemistry Section of the Division of Biomedical Sciences, Imperial College)
Title: Methods and challenges in analysis of metabolomics data
Abstract: Metabolomics is the study of the levels of thousands of small molecular weight molecules found in cells, biofluids and tissues. It is of great interest because most biological changes (e.g. aging, disease etc.) affect metabolism and thus leave a characteristic fingerprint in the metabolic profile. Such profiles are information rich and complex and require statistical tools which are adapted specifically to their characteristics. In this talk I will review some of the challenges presented by this data such as multivariate classification, feature selection and metabolite identification. I will show examples of methods developed at Imperial to overcome these challenges including multivariate kernel density estimators, genetic algorithms and statistical correlation spectroscopy. Overall, it is clear that only a relatively small amount of the total information latent in metabolic profiles is currently being usefully extracted, thus posing important questions to be answered in the years ahead.
Seminars 2008-2009
Friday 
14.30 
Title: Tobit models for multivariate, spatiotemporal and compositional data 
Friday 
14.30 
Title: Intelligent Optimisation and Learning Using Tsallis Statistics
Abstract: This talk explores the use of the non-extensive q-distribution, which arises from the optimisation of the Tsallis entropy, in the context of intelligent optimisation. The continuous real parameter q that represents the degree of non-extensivity is used to generate various distributions, which have properties intermediate between those of Gaussian and Lévy statistics. Adaptive mechanisms based on the q-distribution are used to enhance intelligent optimisation methods which employ diffusion and particle swarms. Application examples in high-dimensional nonlinear optimisation problems and neural networks are discussed.

16.00 
Title: Particle MCMC 
Monday 
11.00 
Title: Life Distributions in Survival Analysis and Reliability: Structure of Semiparametric Families 
Friday 
14.30 
Title: Quasivariances 

16.00 
Title: Some statistical aspects of Spatio-Temporal Brain Image Analysis
Friday 
14.30 
Title: On the incoherence of the area under the ROC curve, and what to do about it 

16.00 
Title: Why Income Comparison Is Rational 
Friday 
14.30 
Transfer Talks 
Friday 
14.30 
Title: Statistics for Human Motion Modelling 

16.00 
Title: Genome Wide Association analysis of WTCCC coronary artery disease (CAD) phenotype
Friday 
14.00 
Title: 'What role should formal risk-benefit decision-making play in the regulation of medicines?'
Friday 
16.00 
Title: A Bayesian View of Some Causal Inference Procedures 
Friday 
14.30 
Transfer Talks 
Friday 
14.30 
Title: Simple models for longitudinal data with informative observation 
Friday 
14.30 
Transfer Talks 
Friday 
14.30 
Title: The Dantzig selector in Cox's proportional hazards model 

16.00 
Title: Dissatisfied with schools of inference (Bayesian, likelihood, frequentist etc.)? How to build your own inference engine.
Friday 
14.30 
Title: Stochastic Boosting 

16.00 
Title: Robust Approach to Graphical Modelling for Brain Connectivity via Partial Coherence 
Seminars 2007-2008
Friday 13/06/2008
Flexible Covariance Estimation in Gaussian Graphical models
Bala Rajaratnam (Department of Statistics, Stanford)
Covariance estimation is known to be a challenging problem, especially for high-dimensional data. In this context, graphical models can act as a tool for regularization and have proven to be excellent tools for the analysis of high-dimensional data. Graphical models are statistical models where dependencies between variables are represented by means of a graph. Both frequentist and Bayesian inferential procedures for graphical models have recently received much attention in the statistics literature. The hyper-inverse Wishart distribution is a commonly used prior for Bayesian inference on covariance matrices in Gaussian graphical models. This prior has the distinct advantage that it is a conjugate prior for this model but it suffers from lack of flexibility in high-dimensional problems due to its single shape parameter.
In this talk, for posterior inference on covariance matrices in decomposable Gaussian graphical models, we use a flexible class of conjugate prior distributions defined on the cone of positive-definite matrices with fixed zeros according to a graph G. This class includes the hyper-inverse Wishart distribution and allows for up to k+1 shape parameters where k denotes the number of cliques in the graph. We first add to this class of priors a reference prior, which can be viewed as an improper member of this class. We then derive the general form of the Bayes estimators under traditional loss functions adapted to graphical models and exploit the conjugacy relationship in these models to express these estimators in closed form. The closed-form solutions allow us to avoid heavy computational costs that are usually incurred in these high-dimensional problems. We also investigate decision-theoretic properties of the standard frequentist estimator, which is the maximum likelihood estimator, in these problems. Furthermore, we illustrate the performance of our estimators through numerical examples and comparisons with previous work where we explore frequentist risk properties and the efficacy of graphs in the estimation of high-dimensional covariance structures. We demonstrate that our estimators yield substantial risk reductions over the maximum likelihood estimator in the graphical model.
Low effective dimension in models or data: a key to high dimensional inference?
Peter Bickel (Department of Statistics, Berkeley)
Theoretical analysis seems to suggest that standard problems such as estimating a function of high-dimensional variables with noisy data (regression or classification) should be impossible without detailed knowledge or absurdly large amounts of data. Yet, algorithms to perform classification of high-dimensional images or other high-dimensional objects are remarkably successful. The generally held explanation is the presence of sparsity/low-dimensional structure. I'll discuss, with examples, why this may be right.
Friday 23/05/2008
Statistics and politics: incendiary combination or democratic necessity
John Pullinger
John Pullinger, Librarian and Director General of Information Services at the House of Commons and member of the Council of the Royal Statistical Society, will explore how the word statistics came into use as a branch of politics. His talk will outline the history of the interaction between politics and statistics. This will include the roles played by a number of Prime Ministers and others such as Florence Nightingale. The talk will go on to tell some stories from John's own experience as an Executive Director at the Office for National Statistics until 2004 before discussing the dangers and rewards of putting statistics at the heart of the relationship between the citizen and the state.
Estimating parental ancestry from genotype data
Clive Hoggart (Division of Epidemiology, Imperial)
We describe how the proportion of an individual's genome from each of the continental populations can be estimated from the analysis of genotype data.
The method can be used to predict the biogeographical ancestry and physical appearance of a culprit from DNA recovered from the scene of a crime. The method exploits ancestry informative markers (AIMs), genetic loci which exhibit allele frequency differences between populations. We use a panel of AIMs informative for the four main continental populations: sub-Saharan African, European, East Asian and Native American. With such a dense panel of markers we can estimate the ancestry proportions of the two parental gametes separately. Where an individual has ancestry from more than one continental population we can make inference on the number of generations over which admixture has occurred by modelling the stochastic variation in ancestry along the chromosome. We also make inference on the number of populations contributing to admixture on each gamete by comparison of marginal likelihoods. We demonstrate the method on individuals whose family background is known. Throughout, a fully Bayesian perspective is taken. The analyses use the program ADMIXMAP, which was originally developed for genetic association studies and admixture mapping.
This is joint work with Paul M. McKeigue (University of Edinburgh).
Friday 02/05/2008
Underground explosion or earthquake: Multivariate discrimination has the answer
Dale N Anderson (Pacific Northwest National Laboratory)
Seismic monitoring for underground nuclear explosions answers three questions for all global seismic activity: Where is the seismic event located? What is the event source type (event identification)? If the event is an explosion, what is the yield? The answers to these questions involve processing seismometer waveforms with propagation paths predominately in the mantle. Four discriminants commonly used to identify teleseismic events are depth from travel time, presence of long-period surface energy (mb vs. MS), depth from reflective phases, and polarity of first motion. The seismic theory for these discriminants is well established in the literature. However, the physical basis of each has not been formally integrated into probability models to account for statistical error and provide discriminant calculations appropriate, in general, for multidimensional event identification. This article develops a mathematical statistics formulation of these discriminants and offers a novel approach to multidimensional discrimination that is readily extensible to other discriminants. For each discriminant a probability model is formulated under a general null hypothesis of H0: Explosion Characteristics. The veracity of the hypothesized model is measured with a p-value calculation that can be filtered to be approximately normally distributed and is in the range [0, 1]. The hypothesis test formulation ensures that seismic phenomenology is tied to the interpretation of the p-value. These p-values are then embedded into a multidiscriminant algorithm that is developed from regularized discrimination methods proposed by DiPillo (1976), Smidt and McDonald (1976), and Friedman (1989). Performance of the methods is demonstrated with 102 teleseismic events with magnitudes (mb) ranging from 5 to 6.5.
Learning Curves: Lessons from Statistical Machine Translation
Professor Nello Cristianini (Departments of Engineering Mathematics and Computer Science, University of Bristol)
We will present an overview of Statistical Machine Translation methods, and a discussion of learning curves in this context.
Friday 14/03/2008
Transfer Talks
James Bentham
Fanyin Zhou
Inmaculada VidanaMarquez
Friday 7/03/2008
Imperial Joint Statistics Seminar  Short Talks
(follow the link for the full program)
Friday 22/02/2008
Transfer Talks
Hung Lu
Theodoros Tsagaris
Zi Yang
Friday 15/02/2008
Classifier ensembles for changing environments
Dr. Ludmila I. Kuncheva (School of Informatics, University of Wales, Bangor)
Classification problems coming from real life are hardly ever static. Class description changes, probability distributions float, new classes appear and old classes disappear, novel technologies enable new complex and more indicative features to be measured. The talk will outline the existing work in this direction. Individual classifier models as well as classifier ensembles will be presented. We will touch upon some of the critical issues of the area including change detection techniques, forgetting strategies and the notorious lack of benchmark data.
Meta analysis on the normal calibration scale
Dr Elena Kulinskaya (Statistical Advisory Service, Imperial)
This talk is about an approach to meta analysis and to statistical evidence developed jointly with Stephan Morgenthaler and Robert Staudte, and now written up in our book 'Meta Analysis: a guide to calibrating and combining statistical evidence', Wiley, February 2008.
The traditional ways of measuring evidence, in particular with p-values, are neither intuitive nor useful when it comes to making comparisons between experimental results, or when combining them. We measure evidence for an alternative hypothesis, not evidence against a null. To do this, we have in a sense adopted standardized scores for the calibration scale. Evidence for us is simply a transformation of a test statistic S to another one (called evidence T=T(S)) whose distribution is close to normal with variance 1, and whose mean grows from 0 with the parameter as it moves away from the null. Variance stabilization is used to arrive on this scale. For meta analysis the results from different studies are transformed to a common calibration scale, where it is simpler to combine and interpret them.
I'll provide an introduction and an overview, including some open problems.
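As a rough illustration of the calibration idea described above, the following Python sketch transforms a binomial proportion to an approximately unit-variance evidence scale and then combines several studies. The arcsine transform and the square-root-of-sample-size weights are assumptions chosen for illustration (they are standard devices, but the abstract does not say which transformations or weights the speaker uses), and the function names are hypothetical.

```python
import math


def evidence_binomial(successes, n, p0):
    """Evidence for p > p0 on an approximately N(mean, 1) scale.

    Assumption: the arcsine square-root transform is used as the
    variance-stabilizing transformation, since asin(sqrt(p_hat)) has
    variance close to 1 / (4 n) for a binomial proportion.
    """
    p_hat = successes / n
    return 2.0 * math.sqrt(n) * (
        math.asin(math.sqrt(p_hat)) - math.asin(math.sqrt(p0))
    )


def combine_evidence(evidences, sizes):
    """Combine per-study evidence with weights sqrt(n_i / sum(n)).

    The squared weights sum to one, so if each input has variance 1
    and the studies are independent, the combined value also has
    variance 1.
    """
    total = sum(sizes)
    return sum(math.sqrt(n) * t for t, n in zip(evidences, sizes)) / math.sqrt(total)


# Three hypothetical studies testing p0 = 0.5.
studies = [(30, 50), (62, 100), (45, 80)]
t_values = [evidence_binomial(s, n, 0.5) for s, n in studies]
t_combined = combine_evidence(t_values, [n for _, n in studies])
```

Because every study is mapped to the same unit-variance scale before combining, the combined value can be read like a standard normal score regardless of the individual sample sizes.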
Friday 25/01/2008
Dimension Reduction Paradigms for Regression
Professor R. Dennis Cook (School of Statistics, University of Minnesota)
Dimension reduction for regression, represented primarily by principal components, is ubiquitous in the applied sciences. This is an old idea that has moved to a position of prominence in recent years because technological advances now allow scientists to routinely formulate regressions in which the number p of predictors is considerably larger than in the past. Although "large" p regressions are perhaps mainly responsible for renewed interest, dimension reduction methodology can be useful regardless of the size of p. Starting with a little history and a definition of "sufficient reductions", we will consider a variety of models for dimension reduction in regression. The models start from one in which maximum likelihood estimation produces principal components, step along a few incremental expansions, and end with forms that have the potential to improve on some standard methodology. This development provides remedies for two concerns that have dogged principal components in regression: principal components are typically computed from the predictors alone and then do not make apparent use of the response, and they are not equivariant under full rank linear transformation of the predictors.
Friday 30/11/2007
Modelling non-stationary extreme values with application to surface-level ozone
Professor Jonathan Tawn (Department of Mathematics and Statistics, Lancaster University)
Statistical methods for modelling extremes of stationary sequences have received much attention. The most common method is to model the rate and size of exceedances of some high constant threshold; the size of exceedances is modelled using a generalised Pareto distribution (GPD). Frequently, data sets display non-stationarity; this is especially common in environmental applications. The ozone data set presented here is an example of such a data set. Surface-level ozone levels display complex seasonal patterns and trends due to the mechanisms involved in ozone formation. The standard methods of modelling the extremes of a non-stationary process focus on retaining a constant threshold but using covariate models in the rate and GPD parameters. In this talk an alternative approach will be proposed that uses preprocessing methods to model the non-stationarity in the body of the process and then uses standard methods to model the extremes of the preprocessed data. I will try to justify a claim that the preprocessing method gives a model that better incorporates the underlying mechanisms that generate the process, produces a simpler and more efficient fit and allows easier computation.
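The standard threshold approach mentioned in the abstract (an exceedance rate plus a GPD for exceedance sizes) can be sketched in a few lines of Python. This is only a minimal illustration for a stationary sample: the method-of-moments fit is swapped in for simplicity (it is an assumption, not the estimation method of the talk, and it is only valid when the shape parameter is below 1/2), and all function names are hypothetical.

```python
import random
import statistics


def exceedances(data, threshold):
    """Sizes of the exceedances of a constant threshold."""
    return [x - threshold for x in data if x > threshold]


def fit_gpd_moments(excess):
    """Method-of-moments estimates (sigma, xi) for a GPD.

    Uses mean = sigma / (1 - xi) and
    variance = sigma^2 / ((1 - xi)^2 (1 - 2 xi)), valid for xi < 1/2.
    """
    m = statistics.mean(excess)
    v = statistics.variance(excess)
    ratio = m * m / v  # equals 1 - 2 xi under the model
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * m * (1.0 + ratio)
    return sigma, xi


def peaks_over_threshold(data, threshold):
    """Return (exceedance rate, GPD scale, GPD shape)."""
    excess = exceedances(data, threshold)
    rate = len(excess) / len(data)
    sigma, xi = fit_gpd_moments(excess)
    return rate, sigma, xi


# Simulated stationary data: unit exponential, whose exceedances of any
# threshold are again unit exponential (GPD with sigma = 1, xi = 0).
rng = random.Random(0)
sample = [rng.expovariate(1.0) for _ in range(20000)]
rate, sigma, xi = peaks_over_threshold(sample, threshold=1.0)
```

For non-stationary data such as the ozone series, the rate and GPD parameters would no longer be constants, which is the difficulty the covariate and preprocessing approaches in the abstract are designed to address.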
Determining parameter redundancy using symbolic algebra
Professor Byron Morgan (Institute of Mathematics, Statistics and Actuarial Science, University of Kent)
A model is parameter redundant if it can be rewritten in terms of a smaller set of parameters. Computer packages for symbolic algebra provide a modern approach to the problem of parameter redundancy, allowing one to determine how many parameters may be estimated using classical inference, and also to identify which combinations of the original parameters are involved. Full rank models, in which all parameters can in principle be estimated, may be classified as essentially or conditionally full rank, by means of an extended PLUR matrix decomposition. The parameter redundancy status of models for a given structure and size may be extended to models of the same structure and any size by means of expansion theorems, and difficulties with the memory limitations of symbolic algebra packages may be overcome by means of the imaginative use of exhaustive summaries. The link with weak identifiability in Bayesian inference is also mentioned. The approach may be applied in many different areas, and in this talk it is illustrated by a range of examples involving mark-recapture-recovery data on a number of wild animal species. This talk describes joint research with Ted Catchpole and Diana Cole, who has been supported by the EPSRC.
Friday 16/11/2007
Transfer Talks
Richard Russell
Christoforos Anagnostopoulos
Irfan Sheikh
Friday 09/11/2007
Advances in Consistent Estimation for Tracking
Dr Steven Reece (Pattern Analysis Research Group, Dept. Engineering Science, Oxford University)
The term "consistent estimation" has been used unconventionally by Julier and Uhlmann to describe a highly flexible approach to estimation which uses conservative covariance matrices to capture impurities and missing components in tracker models. The approach uses the Kalman filter as the basic inference engine but is supplemented by techniques such as Covariance Intersection (CI) and Covariance Union (CU). The approach offers solutions to rumour propagation when performing inference in cyclic graphs, inference with incomplete covariance models and multiple hypothesis tracking with unknown assignment probabilities. This talk will review consistent estimation methods and then present recent extensions including Bounded Covariance Inflation and Generalised Covariance Union. The new methods offer more flexible, information efficient generalisations of CI and CU and go some way towards defining a unified theory for consistent estimation.
The approach will be demonstrated on inference problems in cyclic graphs, Decentralised Simultaneous Localisation and Mapping (DSLAM) problems from the robotics domain and decentralised urban fire monitoring.
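For readers unfamiliar with Covariance Intersection, the following Python sketch shows the basic fusion rule for two estimates whose cross-correlation is unknown: the fused information matrix is a convex combination of the two input information matrices. The grid search over the weight and the choice of the trace as the criterion are assumptions made for illustration, the function name is hypothetical, and the extensions mentioned in the abstract (Bounded Covariance Inflation, Generalised Covariance Union) are not covered.

```python
import numpy as np


def covariance_intersection(a, A, b, B, n_grid=101):
    """Fuse estimates (a, A) and (b, B) with unknown cross-correlation.

    For a weight w in [0, 1]:
        P^{-1} = w A^{-1} + (1 - w) B^{-1}
        x      = P (w A^{-1} a + (1 - w) B^{-1} b)
    Assumption: w is chosen on a grid to minimise trace(P).
    """
    A_inv = np.linalg.inv(A)
    B_inv = np.linalg.inv(B)
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        P = np.linalg.inv(w * A_inv + (1.0 - w) * B_inv)
        score = np.trace(P)
        if best is None or score < best[0]:
            x = P @ (w * A_inv @ a + (1.0 - w) * B_inv @ b)
            best = (score, w, x, P)
    _, w, x, P = best
    return x, P, w


# Two trackers, each confident along a different axis.
a, A = np.array([1.0, 0.0]), np.diag([1.0, 4.0])
b, B = np.array([0.0, 1.0]), np.diag([4.0, 1.0])
x_fused, P_fused, w_opt = covariance_intersection(a, A, b, B)
```

Unlike a Kalman update that assumes independent inputs, this rule never claims more confidence than a convex combination of the inputs allows, which is what makes it suitable when the same information may have propagated along several paths of a cyclic graph.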
Modelling Multiple Time Series via Common Factors
Professor Qiwei Yao (Department of Statistics, LSE)
We propose a new method for estimating common factors of multiple time series. One distinctive feature of the new approach is that it is applicable to non-stationary time series. The unobservable (non-stationary) factors are identified by expanding the orthogonal complement of the factor loading space step by step, thereby solving a high-dimensional optimization problem through many low-dimensional subproblems. Asymptotic properties of the estimation were investigated. The proposed methodology was illustrated with both simulated and real data sets.
Friday 19/10/2007
Analysis of Stationary, Quasi-Seasonal Processes
Dr Emma McCoy (Imperial College London)
Time series which exhibit seasonality with a period that is not fixed but varies through time are common in a wide range of physical phenomena. This quasi-seasonal dependence can be modelled using extensions of the standard ARIMA model that incorporate seasonal persistence. For such data, standard frequency-domain procedures produce biased estimators, even for large sample sizes. This bias can be addressed by considering an alternative frequency-domain likelihood approximation; the form of this likelihood means that its asymptotic and large-sample properties, and those of the associated maximum likelihood estimators for both the seasonality and degree of persistence, can be developed. After outlining the procedure and its properties, I will finish by comparing it to more standard procedures in the analysis of simulated data and data from econometric and meteorological applications.
Computing the maximum likelihood estimator of a multidimensional log-concave density
Dr Richard Samworth (Statistical Laboratory, Cambridge)
Abstract: We show that if $X_1,...,X_n$ are a random sample from a log-concave density $f$ in $\mathbb{R}^d$, then with probability one there exists a unique maximum likelihood estimator $\hat{f}_n$ of $f$. The use of this estimator is attractive because, unlike kernel density estimation, the estimator is fully automatic, with no smoothing parameters to choose. The existence proof is non-constructive, however, and in practice we require an iterative algorithm that converges to the estimator. By reformulating the problem as one of non-differentiable convex optimisation, we are able to exhibit such an algorithm. We are also able to extend the methodology to fit finite mixtures of log-concave densities, yielding a promising technique for clustering and/or classification. The talk will be illustrated with pictures from the R package LogConcDEAD. This is joint work with Madeleine Cule (Cambridge) and Michael Stewart (University of Sydney).