Mr Mike Brookes

Faculty of Engineering, Department of Electrical and Electronic Engineering

Emeritus Reader

Contact

+44 (0)20 7594 6165 mike.brookes Website

Assistant

Miss Vanessa Rodriguez-Gonzalez +44 (0)20 7594 6267

Location

807a, Electrical Engineering, South Kensington Campus

Publications

Gonzalez S, Brookes M, 2014, PEFAC - a pitch estimation algorithm robust to high levels of noise, IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 22, Pages: 518-530, ISSN: 2329-9290

We present PEFAC, a fundamental frequency estimation algorithm for speech that is able to identify voiced frames and estimate pitch reliably even at negative signal-to-noise ratios. The algorithm combines a normalization stage, to remove channel dependency and to attenuate strong noise components, with a harmonic summing filter applied in the log-frequency power spectral domain, the impulse response of which is chosen to sum the energy of the fundamental frequency harmonics while attenuating smoothly-varying noise components. Temporal continuity constraints are applied to the selected pitch candidates and a voiced speech probability is computed from the likelihood ratio of two classifiers, one for voiced speech and one for unvoiced speech/silence. We compare the performance of our algorithm with that of other widely used algorithms and demonstrate that it performs well in both high and low levels of additive noise.
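The key property behind a harmonic summing filter in the log-frequency domain is that the k-th harmonic of any fundamental sits at a fixed offset of log k above it, so one fixed filter can collect harmonic energy for every pitch candidate. The Python sketch below illustrates only that idea under simplifying assumptions: it uses a plain sum over the first five harmonics (the published filter also attenuates smoothly-varying noise, which is omitted here), and the function name, grid resolution and frequency limits are hypothetical choices, not the authors' implementation.

```python
import numpy as np

def log_freq_harmonic_sum(power, freqs, n_harmonics=5, bins_per_octave=48,
                          f_min=50.0, f_max=4000.0):
    """Sum spectral power at the first n_harmonics of each pitch candidate.

    Illustrative simplification: a plain comb sum on a log-frequency grid.
    """
    # Resample the power spectrum onto a logarithmically spaced frequency grid.
    n_bins = int(np.floor(np.log2(f_max / f_min) * bins_per_octave)) + 1
    log_f = f_min * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    log_power = np.interp(log_f, freqs, power)

    # On this grid, harmonic k lies a fixed number of bins above the fundamental.
    offsets = np.round(np.log2(np.arange(1, n_harmonics + 1))
                       * bins_per_octave).astype(int)

    # Only candidates whose highest harmonic still falls on the grid are scored.
    n_cand = n_bins - offsets[-1]
    score = np.zeros(n_cand)
    for off in offsets:
        score += log_power[off:off + n_cand]
    return log_f[:n_cand], score

# Synthetic test frame: five equal harmonics of 200 Hz plus weak white noise.
fs = 8000
n = 512
true_f0 = 200.0
t = np.arange(n) / fs
rng = np.random.default_rng(0)
x = sum(np.cos(2 * np.pi * k * true_f0 * t) for k in range(1, 6))
x = x + 0.1 * rng.standard_normal(n)

power = np.abs(np.fft.rfft(x * np.hanning(n))) ** 2
freqs = np.fft.rfftfreq(n, d=1.0 / fs)

candidates, score = log_freq_harmonic_sum(power, freqs)
f0_est = candidates[np.argmax(score)]
print(f"estimated f0: {f0_est:.1f} Hz")
```

A plain sum like this is prone to picking sub-harmonics when noise is strong, which is one reason a practical estimator needs the normalization, noise-attenuating filter shape and temporal continuity constraints described in the abstract.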

Journal article

Gilliam C, Dragotti P-L, Brookes M, 2014, On the Spectrum of the Plenoptic Function, IEEE TRANSACTIONS ON IMAGE PROCESSING, Vol: 23, Pages: 502-516, ISSN: 1057-7149

Citations: 36

Journal article

Hilkhuysen G, Gaubitch N, Brookes M, Huckvale M et al., 2014, Effects of noise suppression on intelligibility. II: An attempt to validate physical metrics, JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, Vol: 135, Pages: 439-450, ISSN: 0001-4966

Citations: 5

Journal article

Pearson J, Visentini-Scarzanella M, Brookes M, Dragotti PL et al., 2014, TILTED LAYER-BASED MODELING FOR ENHANCED LIGHT-FIELD PROCESSING AND IMAGE BASED RENDERING, IEEE International Conference on Image Processing (ICIP), Publisher: IEEE, Pages: 1917-1921, ISSN: 1522-4880

Conference paper

Jones Z, Brookes M, Dragotti PL, Benton D et al., 2014, WIDE-BASELINE IMAGE CHANGE DETECTION, IEEE International Conference on Image Processing (ICIP), Publisher: IEEE, Pages: 1589-1593, ISSN: 1522-4880

Conference paper

Stanton R, Brookes M, 2014, PATH UNCERTAINTY ROBUST BEAMFORMING, 22nd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1925-1929, ISSN: 2076-1465

Conference paper

Wang Y, Brookes M, 2014, SPEECH ENHANCEMENT USING A MODULATION DOMAIN KALMAN FILTER POST-PROCESSOR WITH A GAUSSIAN MIXTURE NOISE MODEL, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, ISSN: 1520-6149

Conference paper

Gonzalez S, Brookes M, 2014, MASK-BASED ENHANCEMENT FOR VERY LOW QUALITY SPEECH, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, ISSN: 1520-6149

Citations: 3

Conference paper

Moore AH, Brookes M, Naylor PA, 2013, Roomprints for forensic audio applications, Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE

A roomprint is a quantifiable description of an acoustic environment which can be measured under controlled conditions and estimated from a monophonic recording made in that space. We here identify the properties required of a roomprint in forensic audio applications and review the observable characteristics of a room that, when extracted from recordings, could form the basis of a roomprint. Frequency-dependent reverberation time is investigated as a promising characteristic and used in a room identification experiment giving correct identification in 96% of trials.

Conference paper

Gaubitch N, Brookes M, Naylor P, 2013, Blind Channel Magnitude Response Estimation in Speech using Spectrum Classification, IEEE Transactions on Audio, Speech, and Language Processing, Vol: 21, Pages: 2162-2171, ISSN: 1558-7916

Journal article

Moore AH, Brookes M, Naylor PA, 2013, Room geometry estimation from a single channel acoustic impulse response, Proc. European Signal Processing Conference (EUSIPCO)

Conference paper

Pearson J, Brookes M, Dragotti P-L, 2013, Plenoptic layer-based modelling for image based rendering, IEEE Transactions on Image Processing, Vol: 22, Pages: 3405-3419, ISSN: 1057-7149

Journal article

Eaton D, Brookes DM, Naylor PA, 2013, A Comparison of Non-Intrusive SNR Estimation Algorithms and the Use of Mapping Functions, EUSIPCO, Publisher: EURASIP, Pages: 1-5

We present a comparative evaluation of six methods for non-intrusive Signal-to-Noise Ratio (SNR) estimation for narrowband speech in noise. We demonstrate that the performance of all methods can be improved by applying a non-linear mapping function to their estimates of SNR. We have employed phrases built from the TIMIT speech corpus and noises from a broad range of sources including ITU-T P.501, NOISEX-92, and Soundjay. We compare the accuracy of the methods in estimating the SNR of both stationary and non-stationary noise and we conclude that with the mapping function, the best current methods can estimate the SNR to within approximately 3.5 dB for SNRs from -5 dB to 35 dB.

Conference paper

Gilliam C, Brookes M, Dragotti PL, 2013, Image-Based Rendering and the Sampling of the Plenoptic Function, Emerging Technologies for 3D Video: Creation, Coding, Transmission and Rendering, Pages: 231-248, ISBN: 9781118355114

Image-based rendering (IBR) is a technique for producing arbitrary views of a scene using multiple images instead of exact object models. The central concept is that each image comprises a collection of light rays and a new view is interpolated from these light rays. If we model the light rays using a seven-dimensional function, known as the plenoptic function, then IBR can be viewed in terms of sampling and reconstruction. Therefore the important goal of minimizing the number of images required in IBR, whilst maintaining rendering quality, can be examined through sampling analysis of the plenoptic function. In this context, the chapter examines the state of the art in plenoptic sampling theory. It focuses on both uniform and adaptive sampling of the plenoptic function. In particular, it presents theoretical results for uniform sampling based on spectral analysis of the plenoptic function and algorithms for adaptive plenoptic sampling. © 2013 by John Wiley & Sons, Ltd.

Citations: 1

Book chapter

Sharma D, Naylor PA, Brookes M, 2013, NON-INTRUSIVE SPEECH INTELLIGIBILITY ASSESSMENT, 21st European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Citations: 4

Conference paper

Wang Y, Brookes M, 2013, A SUBSPACE METHOD FOR SPEECH ENHANCEMENT IN THE MODULATION DOMAIN, 21st European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Conference paper

Gonzalez S, Brookes M, 2013, SPEECH ACTIVE LEVEL ESTIMATION IN NOISY CONDITIONS, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 6684-6688, ISSN: 1520-6149

Conference paper

Wang Y, Brookes M, 2013, SPEECH ENHANCEMENT USING A ROBUST KALMAN FILTER POST-PROCESSOR IN THE MODULATION DOMAIN, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 7457-7461, ISSN: 1520-6149

Citations: 6

Conference paper

Gaubitch ND, Löllmann HW, Jeub M, Falk TH, Naylor PA, Vary P, Brookes M et al., 2012, Performance comparison of algorithms for blind reverberation time estimation from speech

The reverberation time, T60, is one of the key parameters used to quantify room acoustics. It can provide information about the quality and intelligibility of speech recorded in a reverberant environment, and it can be used to increase robustness to reverberation of speech processing algorithms. T60 can be determined directly from a measurement of the acoustic impulse response, but in situations where this is unavailable it must be estimated blindly from reverberant speech. In this contribution, we provide a study of three state-of-the-art methods for blind T60 estimation. Experimental results with a large number of talkers, simulated and measured acoustic impulse responses, and various levels of additive white Gaussian noise are presented. The relative merits of the three methods in terms of computational time, estimation accuracy, noise sensitivity and inter-talker variance are discussed. In general, all three methods are able to estimate the reverberation time to within 0.2 s for T60 ≤ 0.8 s and SNR ≥ 30 dB, while increasing the noise level causes overestimation. The relative computational speed of the three methods is also assessed.
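For the non-blind case mentioned above, where a measured acoustic impulse response is available, a common textbook approach is Schroeder backward integration followed by a straight-line fit to the decay curve. The Python sketch below illustrates that standard technique and is not one of the three blind estimators compared in the paper; the function name, the -5 dB to -35 dB fit range and the synthetic test response are assumptions made for illustration.

```python
import numpy as np

def t60_from_impulse_response(h, fs, fit_range_db=(-5.0, -35.0)):
    """Estimate T60 in seconds from an impulse response by Schroeder integration."""
    energy = np.asarray(h, dtype=float) ** 2

    # Energy decay curve: energy remaining from each sample to the end.
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0])

    # Fit a straight line to the chosen portion of the decay (assumed range).
    upper, lower = fit_range_db
    idx = np.where((edc_db <= upper) & (edc_db >= lower))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)  # slope in dB per second

    # T60 is the time needed for the fitted line to fall by 60 dB.
    return -60.0 / slope

# Synthetic impulse response: white noise with an exponential envelope whose
# energy falls by 60 dB after true_t60 seconds.
fs = 16000
true_t60 = 0.5
t = np.arange(int(1.5 * fs)) / fs
rng = np.random.default_rng(0)
h = rng.standard_normal(t.size) * np.exp(-3.0 * np.log(10.0) * t / true_t60)

estimate = t60_from_impulse_response(h, fs)
print(f"estimated T60: {estimate:.3f} s")
```

With measured responses the lower end of the fit range usually has to stay above the noise floor, which is why a 30 dB span is extrapolated to 60 dB here.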

Citations: 47

Conference paper

Hilkhuysen G, Gaubitch N, Brookes M, Huckvale M et al., 2012, Effects of noise suppression on intelligibility: dependency on signal-to-noise ratios, Journal of the Acoustical Society of America, Vol: 131, Pages: 531-539

Journal article

Sharma D, Hilkhuysen G, Naylor PA, Gaubitch ND, Huckvale M, Brookes M et al., 2012, Descriptive Vocabulary Development for Degraded Speech, 13th Annual Conference of the International-Speech-Communication-Association, Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC, Pages: 1494-1497

Conference paper

Gonzalez S, Brookes M, 2012, SIBILANT SPEECH DETECTION IN NOISE, 13th Annual Conference of the International-Speech-Communication-Association, Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC, Pages: 1486-1489

Conference paper

Gilliam C, Pearson J, Brookes M, Dragotti PL et al., 2012, IMAGE BASED RENDERING WITH DEPTH CAMERAS: HOW MANY ARE NEEDED?, IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 5437-5440, ISSN: 1520-6149

Conference paper

Sharma D, Naylor PA, Gaubitch ND, Brookes M et al., 2012, NON INTRUSIVE CODEC IDENTIFICATION ALGORITHM, IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 4477-4480, ISSN: 1520-6149

Citations: 7

Conference paper

Chorti A, Brookes M, 2011, On the Effect of Voigt Profile Oscillators on OFDM Systems, IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, Vol: 58, Pages: 768-772, ISSN: 1549-7747

Citations: 1

Journal article

Gaubitch ND, Brookes M, Naylor PA, Sharma D et al., 2011, Bayesian Adaptive method for estimating Speech Intelligibility in noise, Pages: 169-174

We present the Bayesian Adaptive Speech Intelligibility Estimation (BASIE) method - a tool for rapid estimation of a given speech reception threshold (SRT) and the slope at that threshold of multiple psychometric functions for speech intelligibility in noise. The core of this tool is an adaptive Bayesian procedure, which adjusts the signal-to-noise ratio at each subsequent stimulus such that the expected variance of the threshold and slope estimates is minimised. Simulation results show that the algorithm is able to achieve SRT estimates accurate to within ±1 dB in under 30 iterations. Furthermore, we discuss strategies for using BASIE to evaluate the effects of speech processing algorithms on intelligibility and we give two illustrative examples for different noise reduction methods with supporting listening experiments.

Citations: 1

Conference paper

Sharma D, Hilkhuysen G, Gaubitch ND, Brookes M, Naylor PA et al., 2011, C-Qual - A validation of PESQ using degradations encountered in forensic and law enforcement audio, Pages: 177-181

Assessment of speech quality of law-enforcement audio recordings is important as degradations introduced by non-ideal recording conditions can reduce the intelligence value of such recordings. Furthermore a model that predicts speech quality could be beneficial for assessing the performance of audio collection and enhancement systems. The Perceptual Evaluation of Speech Quality (PESQ) algorithm (ITU-T P.862) has been validated for degradations common in telecommunications. In this paper we apply PESQ to degradations typically encountered in law-enforcement. Also we present a subjectively labeled database (C-Qual) containing distortions encountered in law enforcement scenarios. Comparing the prediction by PESQ and the observed opinions provided by the listeners shows that PESQ is less suitable for estimating the speech quality in this context.

Citations: 1

Conference paper

Chorti A, Brookes M, 2011, Performance analysis of OFDM and DAB receivers in the presence of spurious tones, TELECOMMUNICATION SYSTEMS, Vol: 46, Pages: 181-190, ISSN: 1018-4864

Citations: 4

Journal article

Bai J, Brookes M, 2011, ADAPTIVE HIDDEN MARKOV MODELS FOR NOISE MODELLING, 19th European Signal Processing Conference (EUSIPCO), Publisher: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, Pages: 2324-2328, ISSN: 2076-1465

Conference paper

Gonzalez S, Brookes M, 2011, A PITCH ESTIMATION FILTER ROBUST TO HIGH LEVELS OF NOISE (PEFAC), 19th European Signal Processing Conference (EUSIPCO), Publisher: EUROPEAN ASSOC SIGNAL SPEECH & IMAGE PROCESSING-EURASIP, Pages: 451-455, ISSN: 2076-1465

Citations: 46

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.