Publications

Rasines DG, Young GA, 2023, Splitting strategies for post-selection inference, Biometrika, Vol: 110, Pages: 597-614, ISSN: 0006-3444

We consider the problem of providing valid inference for a selected parameter in a sparse regression setting. It is well known that classical regression tools can be unreliable in this context because of the bias generated in the selection step. Many approaches have been proposed in recent years to ensure inferential validity. In this article we consider a simple alternative to data splitting based on randomizing the response vector, which allows for higher selection and inferential power than the former, and is applicable with an arbitrary selection rule. We perform a theoretical and empirical comparison of the two methods and derive a central limit theorem for the randomization approach. Our investigations show that the gain in power can be substantial.

Journal article

Rasines DG, Young GA, 2022, Empirical Bayes and selective inference, Journal of the Indian Institute of Science, Vol: 102, Pages: 1205-1217, ISSN: 0970-4140

We review the empirical Bayes approach to large-scale inference. In the context of the problem of inference for a high-dimensional normal mean, empirical Bayes methods are advocated as they exhibit risk-reducing shrinkage, while establishing appropriate control of frequentist properties of the inference. We elucidate these frequentist properties and evaluate the protection that empirical Bayes provides against selection bias.

Journal article

Rasines DG, Young GA, 2022, Bayesian selective inference, Handbook of Statistics, Pages: 43-65, ISBN: 9780323952682

We discuss Bayesian inference for parameters selected using the data. First, we provide a critical analysis of the existing positions in the literature regarding the correct Bayesian approach under selection. Second, we discuss two types of noninformative prior for selection models. These priors may be employed to produce a posterior distribution in the absence of prior information, as well as to provide well-calibrated frequentist inference for the selected parameter. We illustrate the proposed priors empirically through several examples.

Book chapter

Kuffner TA, Lee SMS, Young GA, 2021, Block bootstrap optimality and empirical block selection for sample quantiles with dependent data, Biometrika, Vol: 108, Pages: 675-692, ISSN: 0006-3444

Journal article

Young GA, 2020, High‐dimensional statistics: a non‐asymptotic viewpoint, Martin J. Wainwright, Cambridge University Press, 2019, xvii + 552 pages, £57.99, hardback ISBN: 978‐1‐1084‐9802‐9, International Statistical Review, Vol: 88, Pages: 258-261, ISSN: 0306-7734

Journal article

Young GA, Lee S, Kuffner T, 2018, Consistency of a hybrid block bootstrap for distribution and variance estimation for sample quantiles of weakly dependent sequences, Australian and New Zealand Journal of Statistics, Vol: 60, Pages: 103-114, ISSN: 1369-1473

Consistency and optimality of block bootstrap schemes for distribution and variance estimation of smooth functionals of dependent data have been thoroughly investigated by Hall, Horowitz & Jing (1995), among others. However, for nonsmooth functionals, such as quantiles, much less is known. Existing results, due to Sun & Lahiri (2006), regarding strong consistency for distribution and variance estimation via the moving block bootstrap (MBB) require that b→∞, where b=⌊n/ℓ⌋ is the number of resampled blocks to be pasted together to form the bootstrap data series, n is the available sample size, and ℓ is the block length. Here we show that, in fact, weak consistency holds for any b such that 1≤b=O(n/ℓ). In other words we show that a hybrid between the subsampling bootstrap (b=1) and MBB is consistent. Empirical results illustrate the performance of hybrid block bootstrap estimators for varying numbers of blocks.

Journal article

Kuffner TA, Young A, 2018, Principled Statistical Inference in Data Science, Conference on Statistical Data Science, Publisher: World Scientific Publ Co Pte Ltd, Pages: 21-36

Conference paper

DiCiccio TJ, Kuffner TA, Young GA, 2017, A simple analysis of the exact probability matching prior in the location-scale model, The American Statistician, Vol: 71, Pages: 302-304, ISSN: 1537-2731

It has long been asserted that in univariate location-scale models, when concerned with inference for either the location or scale parameter, the use of the inverse of the scale parameter as a Bayesian prior yields posterior credible sets which have exactly the correct frequentist confidence set interpretation. This claim dates to at least Peers (1965), and has subsequently been noted by various authors, with varying degrees of justification. We present a simple, direct demonstration of the exact matching property of the posterior credible sets derived under use of this prior in the univariate location-scale model. This is done by establishing an equivalence between the conditional frequentist and posterior densities of the pivotal quantities on which conditional frequentist inferences are based.

Journal article

Young GA, Kuffner TA, DiCiccio TJ, 2017, The formal relationship between analytic and bootstrap approaches to parametric inference, Journal of Statistical Planning and Inference, Vol: 191, Pages: 81-87, ISSN: 0378-3758

Two routes most commonly proposed for accurate inference on a scalar interest parameter in the presence of a (possibly high-dimensional) nuisance parameter are parametric simulation (‘bootstrap’) methods, and analytic procedures based on normal approximation to adjusted forms of the signed root likelihood ratio statistic. Under some null hypothesis of interest, both methods yield p-values which are uniformly distributed to error of third-order in the available sample size. But, given a specific dataset, what is the formal relationship between p-values calculated by the two approaches? We show that the two methodologies give the same inference to second order in general: the analytic p-value calculated from a dataset will agree with the bootstrap p-value constructed from that same dataset to O(n⁻¹), where n is the sample size. In practice, the agreement is often startling.

Journal article

Young GA, Lee SMS, 2016, Distribution of likelihood-based p-values under a local alternative hypothesis, Biometrika, Vol: 103, Pages: 641-652, ISSN: 1464-3510

We consider inference on a scalar parameter of interest in the presence of a nuisance parameter, using a likelihood-based statistic which is asymptotically normally distributed under the null hypothesis. Higher-order expansions are used to compare the repeated sampling distribution, under a general contiguous alternative hypothesis, of p-values calculated from the asymptotic normal approximation to the null sampling distribution of the statistic with the distribution of p-values calculated by bootstrap approximations. The results of comparisons in terms of power of different testing procedures under an alternative hypothesis are closely related to differences under the null hypothesis, specifically the extent to which testing procedures are conservative or liberal under the null. Empirical examples are given which demonstrate that higher-order asymptotic effects may be seen clearly in small-sample contexts.

Journal article

Young GA, 2015, Introduction to High-dimensional Statistics, International Statistical Review, Vol: 83, Pages: 515-516, ISSN: 1751-5823

Journal article

Young GA, Montana G, Ruan D, 2015, Differential analysis of biological networks, BMC Bioinformatics, Vol: 16, ISSN: 1471-2105

Background: In cancer research, the comparison of gene expression or DNA methylation networks inferred from healthy controls and patients can lead to the discovery of biological pathways associated to the disease. As a cancer progresses, its signalling and control networks are subject to some degree of localised re-wiring. Being able to detect disrupted interaction patterns induced by the presence or progression of the disease can lead to the discovery of novel molecular diagnostic and prognostic signatures. Currently there is a lack of scalable statistical procedures for two-network comparisons aimed at detecting localised topological differences. Results: We propose the dGHD algorithm, a methodology for detecting differential interaction patterns in two-network comparisons. The algorithm relies on a statistic, the Generalised Hamming Distance (GHD), for assessing the degree of topological difference between networks and evaluating its statistical significance. dGHD builds on a non-parametric permutation testing framework but achieves computational efficiency through an asymptotic normal approximation. Conclusions: We show that the GHD is able to detect more subtle topological differences compared to a standard Hamming distance between networks. This results in the dGHD algorithm achieving high performance in simulation studies as measured by sensitivity and specificity. An application to the problem of detecting differential DNA co-methylation subnetworks associated to ovarian cancer demonstrates the potential benefits of the proposed methodology for discovering network-derived biomarkers associated with a trait of interest.

Journal article

DiCiccio TJ, Kuffner TA, Young GA, Zaretzki R et al., 2015, Stability and uniqueness of p-values for likelihood-based inference, Statistica Sinica, Vol: 25, Pages: 1355-1376, ISSN: 1017-0405

Likelihood-based methods of statistical inference provide a useful general methodology that is appealing, as a straightforward asymptotic theory can be applied for their implementation. It is important to assess the relationships between different likelihood-based inferential procedures in terms of accuracy and adherence to key principles of statistical inference, in particular those relating to conditioning on relevant ancillary statistics. An analysis is given of the stability properties of a general class of likelihood-based statistics, including those derived from forms of adjusted profile likelihood, and comparisons are made between inferences derived from different statistics. In particular, we derive a set of sufficient conditions for agreement to O_p(n⁻¹), in terms of the sample size n, of inferences, specifically p-values, derived from different asymptotically standard normal pivots. Our analysis includes inference problems concerning a scalar or vector interest parameter, in the presence of a nuisance parameter.

Journal article

DiCiccio TJ, Kuffner TA, Young GA, 2015, Quantifying nuisance parameter effects via decompositions of asymptotic refinements for likelihood-based statistics, Journal of Statistical Planning and Inference, Vol: 165, Pages: 1-12, ISSN: 1873-1171

Accurate inference on a scalar interest parameter in the presence of a nuisance parameter may be obtained using an adjusted version of the signed root likelihood ratio statistic, in particular Barndorff-Nielsen’s R∗ statistic. The adjustment made by this statistic may be decomposed into a sum of two terms, interpreted as correcting respectively for the possible effect of nuisance parameters and the deviation from standard normality of the signed root likelihood ratio statistic itself. We show that the adjustment terms are determined to second-order in the sample size by their means. Explicit expressions are obtained for the leading terms in asymptotic expansions of these means. These are easily calculated, allowing a simple way of quantifying and interpreting the respective effects of the two adjustments, in particular of the effect of a high dimensional nuisance parameter. Illustrations are given for a number of examples, which provide theoretical insight to the effect of nuisance parameters on parametric inference. The analysis provides a decomposition of the mean of the signed root statistic involving two terms: the first has the property of taking the same value whether there are no nuisance parameters or whether there is an orthogonal nuisance parameter, while the second is zero when there are no nuisance parameters. Similar decompositions are discussed for the Bartlett correction factor of the likelihood ratio statistic, and for other asymptotically standard normal pivots.

Journal article

DiCiccio TJ, Kuffner TA, Young GA, 2012, Objective Bayes, conditional inference and the signed root likelihood ratio statistic, Biometrika, Vol: 99, Pages: 675-686

Journal article

Lu K, Young GA, 2012, Parametric bootstrap under model mis-specification, Computational Statistics and Data Analysis, Vol: 56, Pages: 2410-2420

Under model correctness, highly accurate inference on a scalar interest parameter in the presence of a nuisance parameter can be achieved by several routes, among them considering the bootstrap distribution of the signed root likelihood ratio statistic. The context of model mis-specification is considered and inference based on a robust form of the signed root statistic is discussed in detail. Stability of the distribution of the statistic allows accurate inference, outperforming that based on first-order asymptotic approximation, by considering the bootstrap distribution of the statistic under the incorrectly assumed distribution. Comparisons of this simple approach with alternative analytic and non-parametric inference schemes are discussed.

Journal article

DiCiccio TJ, Young GA, 2011, Conditional inference by estimation of a marginal distribution, Selected Works of Debabrata Basu, Editors: DasGupta, Publisher: Springer Verlag, Pages: 9-14, ISBN: 9781441958242

This book contains a little more than 20 of Debabrata Basu's most significant articles and writings.

Book chapter

DiCiccio TJ, Young GA, 2010, Objective Bayes and conditional inference in exponential families, Biometrika, Vol: 97, Pages: 497-504, ISSN: 0006-3444

Journal article

Young GA, DiCiccio TJ, 2010, Computer-intensive Statistical Inference, Complex Data Modeling and Computationally Intensive Statistical Methods, Editors: Mantovan, Secchi, Publisher: Springer, Pages: 137-150, ISBN: 9788847013858

This volume contains 20 selected papers among those presented at the conference "S.Co.2009: Complex data modeling and computationally intensive methods for ...

Book chapter

Young GA, 2009, Routes to higher-order accuracy in parametric inference, Australian & New Zealand Journal of Statistics, Vol: 51, Pages: 115-126, ISSN: 1369-1473

Journal article

DiCiccio TJ, Young GA, 2008, Conditional properties of unconditional parametric bootstrap procedures for inference in exponential families, Biometrika, Vol: 95, Pages: 747-758, ISSN: 0006-3444

Journal article

Cheung KY, Lee SMS, Young GA, 2006, Stein confidence sets based on non-iterated and iterated parametric bootstraps, Statistica Sinica, Vol: 16, Pages: 45-75, ISSN: 1017-0405

Journal article

DiCiccio TJ, Monti AC, Young GA, 2006, Variance stabilization for a scalar parameter, Journal of the Royal Statistical Society Series B: Statistical Methodology, Vol: 68, Pages: 281-303, ISSN: 1369-7412

Journal article

Cheung KY, Lee SMS, Young GA, 2005, Iterating the m out of n bootstrap in nonregular smooth function models, Statistica Sinica, Vol: 15, Pages: 945-967, ISSN: 1017-0405

Journal article

Young GA, Smith RL, 2005, Essentials of statistical inference, Cambridge, Publisher: Cambridge University Press, ISBN: 9780521839716

Book

Lee SMS, Young GA, 2005, Parametric bootstrapping with nuisance parameters, Statistics and Probability Letters, Vol: 71, Pages: 143-153, ISSN: 0167-7152

Journal article

Young GA, 2003, Better bootstrapping by constrained prepivoting, Metron, Vol: 61, Pages: 227-242, ISSN: 0026-1424

Bootstrap methods are attractive empirical procedures for assessment of errors in problems of statistical estimation, and allow highly accurate inference in a vast range of problems. Conventional bootstrapping involves sampling from the empirical distribution function in nonparametric problems, or a fitted parametric model in parametric inference. Recently, much attention has been focussed on methods for reduction of the error properties of bootstrap procedures, by systematic modification of the sampling model, in a way that is dependent on the parameter of interest. In this paper, we provide a general perspective on the bootstrap, based on the notion of prepivoting, with the specific aim of synthesizing recent developments related to modified, or "weighted", bootstrap procedures, and provide a critical evaluation of the practical benefits of such procedures over conventional bootstrap schemes and alternative analytic methods.

Journal article

Robinson J, Ronchetti E, Young GA, 2003, Saddlepoint approximations and tests based on multivariate M-estimates, Annals of Statistics, Vol: 31, Pages: 1154-1169, ISSN: 0090-5364

Journal article

Lee SMS, Young GA, 2003, Prepivoting by weighted bootstrap iteration, Biometrika, Vol: 90, Pages: 393-410, ISSN: 0006-3444

Journal article

Davison AC, Hinkley DV, Young GA, 2003, Recent developments in bootstrap methodology, Statistical Science, Vol: 18, Pages: 141-157, ISSN: 0883-4237

Journal article

Professor Alastair Young
