Imperial College London

Patrick A. Naylor

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor of Speech & Acoustic Signal Processing

Contact

+44 (0)20 7594 6235
p.naylor
Website

Location

803, Electrical Engineering, South Kensington Campus

Publications

336 results found

Eaton J, Naylor PA, 2015, Reverberation time estimation on the ACE corpus using the SDD method, arXiv, ACE Challenge Workshop, a satellite event of IEEE-WASPAA, Publisher: arXiv

Reverberation Time ($T_{60}$) is an important measure for characterizing the properties of a room. The author’s $T_{60}$ estimation algorithm was previously tested on simulated data where the noise is artificially added to the speech after convolution with impulse responses simulated using the image method. We test the algorithm on speech convolved with real recorded impulse responses and noise from the same rooms from the Acoustic Characterization of Environments (ACE) corpus and achieve results comparable to those using simulated data.

Conference paper
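For readers unfamiliar with $T_{60}$, the following is a minimal sketch of the classical intrusive measurement from a known impulse response, using Schroeder backward integration and a line fit to the decay curve. It is not the blind SDD method evaluated in the entry above, which works from reverberant speech. The function names, the -5 to -25 dB fitting range and the synthetic test response are all assumptions chosen for illustration.

```python
import math

def schroeder_edc_db(h):
    """Energy decay curve (dB) by Schroeder backward integration of h squared."""
    edc = [0.0] * len(h)
    tail = 0.0
    # Accumulate energy from the end of the response towards the start.
    for n in range(len(h) - 1, -1, -1):
        tail += h[n] * h[n]
        edc[n] = tail
    total = edc[0]
    return [10.0 * math.log10(e / total) if e > 0.0 else float("-inf") for e in edc]

def estimate_t60(h, fs, hi_db=-5.0, lo_db=-25.0):
    """Fit a line to the EDC between hi_db and lo_db, then extrapolate to -60 dB."""
    edc = schroeder_edc_db(h)
    pts = [(n / fs, d) for n, d in enumerate(edc) if lo_db <= d <= hi_db]
    t_mean = sum(t for t, _ in pts) / len(pts)
    d_mean = sum(d for _, d in pts) / len(pts)
    # Least-squares slope of the decay, in dB per second.
    slope = (sum((t - t_mean) * (d - d_mean) for t, d in pts)
             / sum((t - t_mean) ** 2 for t, _ in pts))
    return -60.0 / slope

# Hypothetical test signal: an exponential decay with a known T60 of 0.5 s.
fs = 8000
true_t60 = 0.5
h = [(-1) ** n * 10.0 ** (-3.0 * n / (fs * true_t60)) for n in range(fs)]
t60 = estimate_t60(h, fs)
```

On this idealised response the fitted value should sit very close to the 0.5 s used to generate it; measured responses with a noise floor usually need the fitting range chosen with more care.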

Moore AH, Evers C, Naylor PA, Alon DL, Rafaely B et al., 2015, Direction of arrival estimation using pseudo-intensity vectors with direct-path dominance test, European Signal Processing Conference, Publisher: IEEE, Pages: 2296-2300, ISSN: 2219-5491

The accuracy of direction of arrival estimation tends to degrade under reverberant conditions due to the presence of reflected signal components which are correlated with the direct path. The recently proposed direct-path dominance test provides a means of identifying time-frequency regions in which a single signal path is dominant. By analysing only these regions it was shown that the accuracy of the FS-MUSIC algorithm could be significantly improved. However, for real-time implementation a less computationally demanding localisation algorithm would be preferable. In the present contribution we investigate the direct-path dominance test as a preprocessing step to pseudo-intensity vector-based localisation. A novel formulation of the pseudo-intensity vector is proposed which further exploits the direct path dominance test and leads to improved localisation performance.

Conference paper

Doire C, Brookes D, Naylor P, Jensen SH et al., 2015, Data-Driven Statistical Modelling of Room Impulse Responses in the Power Domain, European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Having an accurate statistical model of room impulse responses with a minimum number of parameters is of crucial importance in applications such as dereverberation. In this paper, by taking into account the behaviour of the early reflections, we extend the widely-used statistical model proposed by Polack. The squared room impulse response is modelled in each frequency band as the realisation of a stochastic process weighted by the sum of two exponential decays. Room-independent values for the new parameters involved are obtained through analysis of several room impulse response databases, and validation of the model in the likelihood sense is performed.

Conference paper

Hu M, Doclo S, Sharma D, Brookes D, Naylor P et al., 2015, Noise Robust Blind System Identification Algorithms Based On A Rayleigh Quotient Cost Function, European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 2476-2480

An important prerequisite for acoustic multi-channel equalization for speech dereverberation involves the identification of the acoustic channels between the source and the microphones. Blind System Identification (BSI) algorithms based on cross-relation error minimization are known to mis-converge in the presence of noise. Although algorithms have been proposed in the literature to improve robustness to noise, the estimated room impulse responses are usually constrained to have a flat magnitude spectrum. In this paper, noise robust algorithms based on a Rayleigh quotient cost function are proposed. Unlike the traditional algorithms, the estimated impulse responses are not always forced to have unit norm. Experimental results using simulated room impulse responses and several SNRs show that one of the proposed algorithms outperforms competing algorithms in terms of normalized projection misalignment.

Conference paper

Moore AH, Evers C, Naylor PA, 2015, Multichannel equalisation for high-order spherical microphone arrays using beamformed channels, 2015 IEEE International Conference on Digital Signal Processing (DSP), Publisher: IEEE, Pages: 1211-1215, ISSN: 1546-1874

High-order spherical microphone arrays offer many practical benefits including relatively fine spatial resolution in all directions and rotation invariant processing using eigenbeams. Spatial filtering can reduce interference from noise and reverberation but in even moderately reverberant environments the beam pattern fails to suppress reverberation to a level adequate for typical applications. In this paper we investigate the feasibility of applying dereverberation by considering multiple beamformer outputs as channels to be dereverberated. In one realisation we process directly in the spherical harmonic domain where the beampatterns are mutually orthogonal. In a second realisation, which is not limited to spherical microphone arrays, beams are pointed in the direction of dominant reflections. Simulations demonstrate that in both cases reverberation is significantly reduced and, in the best case, clarity index is improved by 15 dB.

Conference paper

Evers C, Moore AH, Naylor PA, Sheaffer J, Rafaely B et al., 2015, Bearing-only acoustic tracking of moving speakers for robot audition, 2015 IEEE International Conference on Digital Signal Processing (DSP), Publisher: IEEE, Pages: 1206-1210, ISSN: 1546-1874

This paper focuses on speaker tracking in robot audition for human-robot interaction. Using only acoustic signals, speaker tracking in enclosed spaces is subject to missing detections and spurious clutter measurements due to speech inactivity, reverberation and interference. Furthermore, many acoustic localization approaches estimate speaker direction, hence providing bearing-only measurements without range information. This paper presents a probability hypothesis density (PHD) tracker that augments the bearing-only speaker directions of arrival with a cloud of range hypotheses at speaker initiation and propagates the random variates through time. Furthermore, due to their formulation PHD filters explicitly model, and hence provide robustness against, clutter and missing detections. The approach is verified using experimental results.

Conference paper

Parada PP, Sharma D, Naylor PA, van Waterschoot T et al., 2015, Reverberant speech recognition exploiting clarity index estimation, Eurasip Journal on Advances in Signal Processing, Vol: 2016, Pages: 1-12, ISSN: 1687-6180

We present single-channel approaches to robust automatic speech recognition (ASR) in reverberant environments based on non-intrusive estimation of the clarity index ($C_{50}$). Our best performing method includes the estimated value of $C_{50}$ in the ASR feature vector and also uses $C_{50}$ to select the most suitable ASR acoustic model according to the reverberation level. We evaluate our method on the REVERB Challenge database employing two different $C_{50}$ estimators and show that our method outperforms the best baseline of the challenge achieved without unsupervised acoustic model adaptation, i.e. using multi-condition hidden Markov models (HMMs). Our approach achieves a 22.4% relative word error rate reduction in comparison to the best baseline of the challenge.

Journal article
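As background to the clarity index used in the entry above, here is a minimal sketch of the standard definition of $C_{50}$: the ratio in dB of impulse response energy arriving within 50 ms of the direct path to the energy arriving later. This computes the quantity intrusively from a known impulse response, whereas the paper estimates it non-intrusively from speech. The function name, the use of the largest peak as the direct-path arrival and the synthetic response are assumptions for illustration.

```python
import math

def clarity_index_db(h, fs, split_ms=50.0):
    """Early-to-late energy ratio in dB, split at split_ms after the direct path."""
    # Assumption: the direct path is the largest-magnitude sample.
    onset = max(range(len(h)), key=lambda n: abs(h[n]))
    split = onset + int(round(split_ms * 1e-3 * fs))
    early = sum(x * x for x in h[:split])
    late = sum(x * x for x in h[split:])
    return 10.0 * math.log10(early / late)

# Hypothetical test signal: an exponential decay with a T60 of 0.5 s.
fs = 8000
t60 = 0.5
h = [(-1) ** n * 10.0 ** (-3.0 * n / (fs * t60)) for n in range(fs)]
c50 = clarity_index_db(h, fs)
```

For an ideal exponential decay the energy falls by 6 dB in the first 50 ms when the reverberation time is 0.5 s, so the result should be close to 10 log10(10^0.6 - 1), roughly 4.7 dB.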

Eaton J, Moore AH, Naylor PA, Skoglund J et al., 2015, Direct-to-reverberant ratio estimation using a null-steered beamformer, ICASSP, Publisher: IEEE, Pages: 46-50

Reverberation affects the quality and intelligibility of distant speech recorded in a room. Direct-to-Reverberant Ratio (DRR) is a useful measure for assessing the acoustic configuration and can be used to inform dereverberation algorithms. We describe a novel DRR estimation algorithm applicable where the signal was recorded with two or more microphones, such as mobile communications devices and laptops. The method uses a null-steered beamformer. In simulations the proposed method yields accurate DRR estimates to within ±4 dB across a wide variety of room sizes, reverberation times and source-receiver distances. It is also shown that the proposed method is more robust to background noise than a baseline approach. The best estimation accuracy is obtained in the region from -5 to 5 dB, which is a relevant range for portable devices.

Conference paper
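To make the DRR in the entry above concrete, the sketch below computes it directly from a known impulse response as the energy in a short window around the direct-path peak divided by the energy everywhere else. This is the ground-truth style definition, not the null-steered beamformer estimator the paper proposes. The function name, the 2.5 ms half-window and the toy impulse response are assumptions for illustration.

```python
import math

def drr_db(h, fs, half_window_ms=2.5):
    """Direct-to-reverberant ratio in dB from an impulse response h."""
    # Assumption: the direct path is the largest-magnitude sample.
    peak = max(range(len(h)), key=lambda n: abs(h[n]))
    half = int(round(half_window_ms * 1e-3 * fs))
    lo, hi = max(0, peak - half), min(len(h), peak + half + 1)
    direct = sum(x * x for x in h[lo:hi])
    reverberant = sum(x * x for x in h[:lo]) + sum(x * x for x in h[hi:])
    return 10.0 * math.log10(direct / reverberant)

# Hypothetical toy response: a unit direct path followed by a flat tail
# whose total energy also equals 1, so the DRR should be about 0 dB.
fs = 8000
h = [0.0] * 1000
h[100] = 1.0
for n in range(200, 300):
    h[n] = 0.1
drr = drr_db(h, fs)
```

Moving the source closer to the microphone raises the direct-path energy relative to the tail, which is why DRR varies with source-receiver distance as well as with the room.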

Zahedi A, Ostergaard J, Jensen SH, Bech S, Naylor P et al., 2015, Audio coding in wireless acoustic sensor networks, SIGNAL PROCESSING, Vol: 107, Pages: 141-152, ISSN: 0165-1684

Journal article

Hu M, Sharma D, Doclo S, Brookes D, Naylor P et al., Speaker change detection and speaker diarization using spatial information, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference paper

Cauchi B, Naylor PA, Gerkmann T, Doclo S, Goetze S et al., 2015, LATE REVERBERANT SPECTRAL VARIANCE ESTIMATION USING ACOUSTIC CHANNEL EQUALIZATION, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 2481-2485, ISSN: 2076-1465

Conference paper

Javed HA, Naylor PA, 2015, AN EXTENDED REVERBERATION DECAY TAIL METRIC AS A MEASURE OF PERCEIVED LATE REVERBERATION, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1063-1067, ISSN: 2076-1465

Conference paper

Nelke CM, Naylor PA, Vary P, 2015, CORPUS BASED RECONSTRUCTION OF SPEECH DEGRADED BY WIND NOISE, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 864-868, ISSN: 2076-1465

Conference paper

Hafezi S, Moore AH, Naylor PA, 2015, MODELING SOURCE DIRECTIVITY IN ROOM IMPULSE RESPONSE SIMULATION FOR SPHERICAL MICROPHONE ARRAYS, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 574-578, ISSN: 2076-1465

Conference paper

Doire CSJ, Brookes M, Naylor PA, Betts D, Hicks CM, Dmour MA, Jensen SH et al., 2015, SINGLE-CHANNEL BLIND ESTIMATION OF REVERBERATION PARAMETERS, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 31-35, ISSN: 1520-6149

Conference paper

Zahedi A, Ostergaard J, Jensen SH, Naylor P, Bech S et al., 2015, Coding and Enhancement in Wireless Acoustic Sensor Networks, Data Compression Conference (DCC), Publisher: IEEE, Pages: 293-302, ISSN: 1068-0314

Conference paper

Hu M, Sharma D, Doclo S, Brookes M, Naylor PA et al., 2015, SPEAKER CHANGE DETECTION AND SPEAKER DIARIZATION USING SPATIAL INFORMATION, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5743-5747, ISSN: 1520-6149

Conference paper

Eaton J, Moore AH, Naylor PA, Skoglund J et al., 2015, DIRECT-TO-REVERBERANT RATIO ESTIMATION USING A NULL-STEERED BEAMFORMER, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 46-50, ISSN: 1520-6149

Conference paper

Sharma D, Poddar A, Manna S, Naylor PA et al., 2015, THE SAS PROJECT: SPEECH SIGNAL PROCESSING IN HIGH SCHOOL EDUCATION, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1781-1785, ISSN: 2076-1465

Conference paper

Lim F, Naylor PA, Thomas MRP, Tashev IJ et al., 2015, ACOUSTIC BLUR KERNEL WITH SLIDING WINDOW FOR BLIND ESTIMATION OF REVERBERATION TIME, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE, ISSN: 1931-1168

Conference paper

Hu M, Parada PP, Sharma D, Doclo S, van Waterschoot T, Brookes M, Naylor PA et al., 2015, SINGLE-CHANNEL SPEAKER DIARIZATION BASED ON SPATIAL FEATURES, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE, ISSN: 1931-1168

Conference paper

Lim F, Zhang W, Habets EAP, Naylor PA et al., 2014, Robust Multichannel Dereverberation using Relaxed Multichannel Least Squares, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, Vol: 22, Pages: 1379-1390, ISSN: 2329-9290

Journal article

Evers C, Moore AH, Naylor PA, 2014, Multiple source localisation in the spherical harmonic domain

Conference paper

Moore AH, Naylor PA, Skoglund J, An Analysis of the Effect of Larynx-Synchronous Averaging on Dereverberation of Voiced Speech, European Signal Processing Conference, ISSN: 2219-5491

Conference paper

Eaton J, Naylor PA, 2014, Detection of clipping in coded speech signals, 21st European Signal Processing Conference (EUSIPCO), Publisher: IEEE

In order to exploit the full dynamic range of communications and recording equipment, and to minimise the effects of noise and interference, input gain to a recording device is typically set as high as possible. This often leads to the signal exceeding the input limit of the equipment resulting in clipping. Communications devices typically rely on codecs such as GSM 06.10 to compress voice signals into lower bitrates. Although detecting clipping in a hard-clipped speech signal is straightforward due to the characteristic flattening of the peaks of the waveform, this is not the case for speech that has subsequently passed through a codec. We describe a novel clipping detection algorithm based on amplitude histogram analysis and least squares residuals which can estimate the clipped samples and the original signal level in speech even after the clipped speech has been perceptually coded.

Conference paper

Jarrett DP, Taseska M, Habets EAP, Naylor PA et al., 2014, Noise Reduction in the Spherical Harmonic Domain Using a Tradeoff Beamformer and Narrowband DOA Estimates, IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol: 22, Pages: 965-976

Journal article

Eaton J, Naylor PA, 2014, Noise-robust detection of peak-clipping in decoded speech, Pages: 7019-7023

Clipping is a commonplace problem in voice telecommunications and detection of clipping is useful in a range of speech processing applications. We analyse and evaluate the performance of three previously presented algorithms for clipping detection in decoded speech in high levels of ambient noise. We identify a baseline method which is well known for clipping detection, determine experimentally the optimized operation parameter for the baseline approach, and use this in our experiments. Our results indicate that the new algorithms outperform the baseline except at extreme levels of clipping and negative signal-to-noise ratios.

Conference paper

Stanton R, Gaubitch N, Naylor P, Brookes DM et al., A Differentiable Approximation to Speech Intelligibility Index with Applications to Listening Enhancement, AES Intl Conf on Audio Forensics

The Speech Intelligibility Index is a standardised objective measure for estimating the intelligibility of speech in noise. It is, however, difficult to use in the iterative optimisation of speech enhancement algorithms because it is a discontinuous function of its input parameters. In this paper, we derive an approximation for the Speech Intelligibility Index that is both continuous and differentiable, which allows for more efficient optimisation procedures. The use of the approximation is demonstrated in an application to near-end speech enhancement.

Conference paper

Antonello N, van Waterschoot T, Moonen M, Naylor PA et al., 2014, SOURCE LOCALIZATION AND SIGNAL RECONSTRUCTION IN A REVERBERANT FIELD USING THE FDTD METHOD, 22nd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 301-305, ISSN: 2076-1465

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
