Imperial College London

Professor Niall Adams

Faculty of Natural Sciences, Department of Mathematics

Professor of Statistics
 
 
 

Contact

 

+44 (0)20 7594 8837
n.adams

 
 

Location

 

6M55, Huxley Building, South Kensington Campus

Citation

BibTeX format

@article{Plasse:2019:10.1007/s11222-019-09858-0,
author = {Plasse, J and Adams, N},
doi = {10.1007/s11222-019-09858-0},
journal = {Statistics and Computing},
pages = {1109--1125},
title = {Multiple changepoint detection in categorical data streams},
url = {http://dx.doi.org/10.1007/s11222-019-09858-0},
volume = {29},
year = {2019}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - The need for efficient tools is pressing in the era of big data, particularly in streaming data applications. As data streams are ubiquitous, the ability to accurately detect multiple changepoints, without affecting the continuous flow of data, is an important issue. Change detection for categorical data streams is understudied, and existing work commonly introduces fixed control parameters while providing little insight into how they may be chosen. This is ill-suited to the streaming paradigm, motivating the need for an approach that introduces few parameters which may be set without requiring any prior knowledge of the stream. This paper introduces such a method, which can accurately detect changepoints in categorical data streams with fixed storage and computational requirements. The detector relies on the ability to adaptively monitor the category probabilities of a multinomial distribution, where temporal adaptivity is introduced using forgetting factors. A novel adaptive threshold is also developed which can be computed given a desired false positive rate. This method is then compared to sequential and nonsequential change detectors in a large simulation study which verifies the usefulness of our approach. A real data set consisting of nearly 40 million events from a computer network is also investigated.
AU  - Plasse, J
AU  - Adams, N
DO  - 10.1007/s11222-019-09858-0
EP  - 1125
PY  - 2019///
SN  - 0960-3174
SP  - 1109
TI  - Multiple changepoint detection in categorical data streams
T2  - Statistics and Computing
UR  - http://dx.doi.org/10.1007/s11222-019-09858-0
UR  - https://link.springer.com/article/10.1007%2Fs11222-019-09858-0
UR  - http://hdl.handle.net/10044/1/67388
VL  - 29
ER  -
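
Illustrative sketch (not from the paper)

The abstract in the record above mentions monitoring the category probabilities of a multinomial distribution with forgetting factors. As a rough illustration only, the Python sketch below keeps exponentially discounted category counts using a single fixed forgetting factor. The class name ForgettingMultinomial and the parameter lam are hypothetical, and the paper's adaptive forgetting factors and adaptive threshold are not reproduced here.

```python
from typing import List


class ForgettingMultinomial:
    """Exponentially discounted estimate of category probabilities.

    Assumption: one fixed forgetting factor `lam` in (0, 1]; lam = 1
    reduces to the ordinary relative frequencies.
    """

    def __init__(self, n_categories: int, lam: float = 0.95) -> None:
        if n_categories < 1:
            raise ValueError("n_categories must be positive")
        if not 0.0 < lam <= 1.0:
            raise ValueError("lam must lie in (0, 1]")
        self.lam = lam
        self.counts = [0.0] * n_categories  # discounted count per category
        self.weight = 0.0                   # discounted number of observations

    def update(self, category: int) -> List[float]:
        """Discount past observations, add the new one, return the estimate."""
        self.counts = [self.lam * c for c in self.counts]
        self.counts[category] += 1.0
        self.weight = self.lam * self.weight + 1.0
        return self.probabilities()

    def probabilities(self) -> List[float]:
        """Current estimate; uniform before any data has been seen."""
        if self.weight == 0.0:
            k = len(self.counts)
            return [1.0 / k] * k
        return [c / self.weight for c in self.counts]


if __name__ == "__main__":
    est = ForgettingMultinomial(n_categories=2, lam=0.9)
    # The stream switches from category 0 to category 1 halfway through.
    for x in [0] * 50 + [1] * 50:
        probs = est.update(x)
    print(probs)
```

Storage and per-observation cost are fixed in the number of categories, which matches the streaming constraint described in the abstract. Smaller values of lam forget the past faster, so the estimate tracks a change sooner at the price of higher variance.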