Most members of this group belong to the Statistics Section and the Biomaths research group of the Department of Mathematics. Listed below are the research areas that members are currently working on, or would like to work on, by applying the mathematical and statistical methods they have developed.

Research areas


Publications

Citation

BibTeX format

@article{Cox:2017,
author = {Cox, DR and Battey, HS},
journal = {Proceedings of the National Academy of Sciences of the United States of America},
pages = {8592--8595},
title = {Large numbers of explanatory variables, a semi-descriptive analysis},
url = {http://hdl.handle.net/10044/1/49830},
volume = {114},
year = {2017}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - Data with a relatively small number of study individuals and a very large number of potential explanatory features arise particularly, but by no means only, in genomics. A powerful method of analysis, the lasso [Tibshirani R (1996) J Roy Stat Soc B 58:267–288], takes account of an assumed sparsity of effects, that is, that most of the features are nugatory. Standard criteria for model fitting, such as the method of least squares, are modified by imposing a penalty for each explanatory variable used. There results a single model, leaving open the possibility that other sparse choices of explanatory features fit virtually equally well. The method suggested in this paper aims to specify simple models that are essentially equally effective, leaving detailed interpretation to the specifics of the particular study. The method hinges on the ability to make initially a very large number of separate analyses, allowing each explanatory feature to be assessed in combination with many other such features. Further stages allow the assessment of more complex patterns such as nonlinear and interactive dependences. The method has formal similarities to so-called partially balanced incomplete block designs introduced 80 years ago [Yates F (1936) J Agric Sci 26:424–455] for the study of large-scale plant breeding trials. The emphasis in this paper is strongly on exploratory analysis; the more formal statistical properties obtained under idealized assumptions will be reported separately.
AU  - Cox,DR
AU  - Battey,HS
EP  - 8595
PY  - 2017///
SP  - 8592
TI  - Large numbers of explanatory variables, a semi-descriptive analysis
T2  - Proceedings of the National Academy of Sciences of the United States of America
UR  - http://hdl.handle.net/10044/1/49830
VL  - 114
ER  - 