Imperial College London

Patrick A. Naylor

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor of Speech & Acoustic Signal Processing

Contact

 

+44 (0)20 7594 6235
p.naylor

Location

 

803, Electrical Engineering, South Kensington Campus

Publications

443 results found

Jarrett DP, Habets EAP, Naylor PA, 2013, Spherical harmonic domain noise reduction using an MVDR beamformer and DOA-based second-order statistics estimation, Proc. of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013)

Conference paper

Eaton J, Gaubitch ND, Naylor PA, 2013, Noise-robust reverberation time estimation using spectral decay distributions with reduced computational cost, Pages: 161-165, ISSN: 1520-6149

Reverberation time (T60) is an important measure of the acoustic properties of a room. It can provide information about the acoustic environment and about the intelligibility and quality of speech recorded in the room, and it can help improve the performance of speech processing algorithms with reverberant speech. Where the acoustic impulse response of the room is not available, the T60 must be estimated non-intrusively from reverberant speech. State-of-the-art non-intrusive T60 estimators have been shown to be strongly biased in the presence of noise. We describe a novel T60 estimation algorithm based on spectral decay distributions that provides robustness to additive noise for a range of realistic noise types for signal-to-noise ratios in the range 0 to 35 dB and T60s between 200 and 950 ms. The proposed method also has much reduced computational cost.

Conference paper

Lim F, Naylor PA, 2013, ROBUST SPEECH DEREVERBERATION USING SUBBAND MULTICHANNEL LEAST SQUARES WITH VARIABLE RELAXATION, 21st European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Conference paper

Lim F, Naylor PA, 2013, ROBUST LOW-COMPLEXITY MULTICHANNEL EQUALIZATION FOR DEREVERBERATION, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 689-693, ISSN: 1520-6149

Conference paper

Sharma D, Naylor PA, Brookes M, 2013, NON-INTRUSIVE SPEECH INTELLIGIBILITY ASSESSMENT, 21st European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Conference paper

Lim F, Thomas MRP, Naylor PA, 2013, MINTFORMER: A SPATIALLY AWARE CHANNEL EQUALIZER, 14th IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, Publisher: IEEE, ISSN: 1931-1168

Conference paper

Jarrett DP, Thiergart O, Habets EAP, Naylor PA, 2012, Coherence-based diffuseness estimation in the spherical harmonic domain, Proc. of the IEEE Convention of Electrical and Electronics Engineers in Israel (IEEEI)

Conference paper

Jarrett DP, Habets EAP, Benesty J, Naylor PA, 2012, A tradeoff beamformer for noise reduction in the spherical harmonic domain, Proc. of the International Workshop on Acoustic Signal Enhancement (IWAENC 2012)

Conference paper

Annibale P, Filos J, Naylor PA, Rabenstein R, 2012, Geometric inference of the room geometry under temperature variations

Geometric inference is an approach for localizing reflectors in a closed acoustic space. It is based on a simple observation that turns time differences of arrival (TDOA) or time of arrival (TOA) measurements from the signals of a microphone array into a geometric constraint. The reflector localization methodology relies on accurate TDOA which is directly dependent on speed of sound information. Estimating the actual speed of sound at the ambient temperature therefore greatly improves the accuracy of the reflector localization in uncontrolled environments. This manuscript shows how to use the geometric inference jointly with the speed of sound estimation for a more accurate reflector localization. Simulations and experiments show the validity of the proposed approach. © 2012 IEEE.

Conference paper
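The dependence of TDOA-based localization on the speed of sound, noted in the abstract above, can be illustrated with a minimal sketch. It assumes the common ideal-gas approximation c = 331.3 * sqrt(1 + T/273.15) m/s for dry air; this formula, the function names, and the example values are illustrative assumptions and are not taken from the paper, which estimates the speed of sound from the measurements themselves.

```python
import math

C0 = 331.3          # assumed speed of sound in dry air at 0 degrees C, in m/s
T0_KELVIN = 273.15  # 0 degrees C expressed in kelvin


def speed_of_sound(temp_c):
    """Approximate speed of sound (m/s) at temp_c degrees Celsius."""
    return C0 * math.sqrt(1.0 + temp_c / T0_KELVIN)


def temperature_from_speed(c):
    """Invert the approximation: air temperature (deg C) for a given speed (m/s)."""
    return T0_KELVIN * ((c / C0) ** 2 - 1.0)


def path_difference(tdoa_s, temp_c):
    """Convert a TDOA (seconds) into a path-length difference (metres)."""
    return speed_of_sound(temp_c) * tdoa_s


if __name__ == "__main__":
    tdoa = 5e-3  # hypothetical 5 ms time difference of arrival
    for temp in (10.0, 20.0, 30.0):
        print(f"{temp:5.1f} C: c = {speed_of_sound(temp):6.1f} m/s, "
              f"path difference = {path_difference(tdoa, temp):.3f} m")
```

Under this approximation the same TDOA maps to a different path-length difference at each temperature, which is why an incorrect speed of sound shifts the geometric constraint used to place the reflector.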

Drugman T, Thomas MRP, Gudnason J, Naylor PA, Dutoit T, 2012, Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review, IEEE Trans. Audio Speech Language Proc., Vol: 20, Pages: 994-1006

Journal article

Habets EAP, Benesty J, Naylor PA, 2012, Speech Distortion and Interference Rejection Constraint Beamformer, IEEE Trans. Audio Speech Language Proc., Vol: 20, Pages: 854-867

Journal article

Lin XS, Khong AWH, Naylor PA, 2012, A Forced Spectral Diversity Algorithm For Speech Dereverberation In The Presence Of Near-common Zeros, IEEE Trans. Audio Speech Language Proc., Vol: 20, Pages: 888-899

Journal article

Canclini A, Antonacci F, Filos J, Sarti A, Naylor P, 2012, Exact localization of planar acoustic reflectors in three-dimensional geometries

In this paper we propose a methodology for localizing acoustic planar reflectors in a 3D geometry, using acoustic measurements acquired by a set of microphones. An acoustic source emitting a known signal is placed close to the wall to be identified, and is used for estimating the source-to-microphone impulse responses. In a preliminary step, such estimates are employed for localizing the source. After that, the Times Of Arrival (TOAs) associated with the first-order reflective paths are extracted from the impulse responses and converted into quadratic constraints (ellipsoids) acting on the reflective plane. The constraints are then collected into a cost function, whose exact minimization leads to the searched plane. A theoretical analysis is performed for predicting the impact of measurement errors on the estimation. Moreover, experimental results in a real meeting room prove the effectiveness of the method.

Conference paper

Gaubitch ND, Löllmann HW, Jeub M, Falk TH, Naylor PA, Vary P, Brookes M, 2012, Performance comparison of algorithms for blind reverberation time estimation from speech

The reverberation time, T60, is one of the key parameters used to quantify room acoustics. It can provide information about the quality and intelligibility of speech recorded in a reverberant environment, and it can be used to increase robustness to reverberation of speech processing algorithms. T60 can be determined directly from a measurement of the acoustic impulse response, but in situations where this is unavailable it must be estimated blindly from reverberant speech. In this contribution, we provide a study of three state-of-the-art methods for blind T60 estimation. Experimental results with a large number of talkers, simulated and measured acoustic impulse responses, and various levels of additive white Gaussian noise are presented. The relative merits of the three methods in terms of computational time, estimation accuracy, noise sensitivity and inter-talker variance are discussed. In general, all three methods are able to estimate the reverberation time to within 0.2 s for T60 ≤ 0.8 s and SNR ≥ 30 dB, while increasing the noise level causes overestimation. The relative computational speed of the three methods is also assessed.

Conference paper
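The abstract above notes that T60 can be determined directly from a measured acoustic impulse response. A minimal sketch of that non-blind case follows, using the classical Schroeder backward-integration approach with a line fit over the -5 to -25 dB range of the decay curve. This is not one of the three blind methods compared in the paper; the function names, the fit range, and the synthetic exponential-decay test signal are all illustrative assumptions.

```python
import math


def synthetic_decay(t60, fs, duration):
    """Noise-free impulse-response envelope whose energy falls 60 dB in t60 seconds."""
    n_samples = int(duration * fs)
    return [10.0 ** (-3.0 * n / (t60 * fs)) for n in range(n_samples)]


def schroeder_t60(h, fs, lo_db=-5.0, hi_db=-25.0):
    """Estimate T60 (seconds) from impulse response h sampled at fs Hz."""
    # Energy decay curve: energy remaining from each sample to the end.
    edc = []
    total = 0.0
    for x in reversed(h):
        total += x * x
        edc.append(total)
    edc.reverse()

    # Express the curve in dB relative to the total energy.
    edc_db = [10.0 * math.log10(v / edc[0]) for v in edc if v > 0.0]

    # Keep the points inside the fit range (between lo_db and hi_db).
    pts = [(n / fs, d) for n, d in enumerate(edc_db) if hi_db <= d <= lo_db]
    if len(pts) < 2:
        raise ValueError("not enough decay range to fit a line")

    # Least-squares slope of level (dB) against time (s).
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    num = sum((t - mean_t) * (d - mean_d) for t, d in pts)
    den = sum((t - mean_t) ** 2 for t, _ in pts)
    slope = num / den  # dB per second, negative for a decaying response

    # Extrapolate the fitted slope to a 60 dB decay.
    return -60.0 / slope


if __name__ == "__main__":
    fs = 8000
    h = synthetic_decay(t60=0.5, fs=fs, duration=1.0)
    print(f"Estimated T60: {schroeder_t60(h, fs):.3f} s")
```

With measured responses the noise floor bends the decay curve, so the fit range would need to be chosen with care; the blind setting studied in the paper has no impulse response available at all and works from reverberant speech instead.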

Lim F, Naylor PA, 2012, Relaxed multichannel least squares with constrained initial taps for multichannel dereverberation

This paper presents a novel algorithm for robust multichannel dereverberation in the presence of system identification errors with the specific aim of avoiding colouration of the equalized signal. Our proposed algorithm is based upon the technique of channel shortening, which targets only the late taps of the room impulse response. Within the framework of the relaxed multichannel least squares (RMCLS) algorithm, we employ partial relaxation of the early taps of the equalized impulse response (EIR) to increase robustness to channel estimation errors, while constraining the initial taps to avoid undesirable colouration of the equalized signal. It is shown through quantitative experimental results that the resultant equalized signal has an overall improved speech quality perception when compared to alternative algorithms.

Conference paper

Naylor PA, Gaubitch ND, 2012, Acoustic signal processing in noise: It's not getting any quieter

Acoustic signal processing research has been addressing the issues associated with additive noise and other degradations in speech for many years and several significant technical advances are now embedded in the state-of-the-art. Nevertheless, the problems are not solved and may actually be worsening. The philosophy advocated in this paper is that further improvements in acoustic signal processing for noise reduction and robustness are, of course, important but are unlikely to be sufficient on their own. Alongside the signal processing, successful systems are likely going to need to include two further factors: an element of matching to the human perception system and also an element of sensing and adaptation to the local environment, giving systems acoustic awareness. Examples of current research on human perception and acoustic signal processing are discussed. These include some aspects of auditory cognition and signal processing methods for building acoustic awareness. A new initiative for benchmarking is also highlighted.

Conference paper

Sharma D, Naylor PA, Gaubitch ND, Brookes M, 2012, NON INTRUSIVE CODEC IDENTIFICATION ALGORITHM, IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 4477-4480, ISSN: 1520-6149

Conference paper

Annibale P, Filos J, Naylor PA, Rabenstein R, 2012, TDOA-based speed of sound estimation for air temperature and room geometry inference, IEEE Trans. Audio, Speech, Lang. Process.

Journal article

Jarrett D, Habets EAP, Thomas M, Naylor PA, 2012, Rigid sphere room impulse response simulation: algorithm and applications, J. Acoust. Soc. America, Vol: 132

Journal article

Sharma D, Hilkhuysen G, Naylor PA, Gaubitch ND, Huckvale M, Brookes M, 2012, Descriptive Vocabulary Development for Degraded Speech, 13th Annual Conference of the International-Speech-Communication-Association, Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC, Pages: 1494-1497

Conference paper

Antonacci F, Filos J, Thomas M, Habets EAP, Sarti A, Naylor PA, 2012, Inference of room geometry from acoustic impulse responses, IEEE Trans. Audio Speech Language Proc., Vol: 20, Pages: 2683-2695

Journal article

Thomas MRP, Gaubitch ND, Habets EAP, Naylor PA, 2012, AN INSIGHT INTO COMMON FILTERING IN NOISY SIMO BLIND SYSTEM IDENTIFICATION, IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 521-524, ISSN: 1520-6149

Conference paper

Filos J, Canclini A, Antonacci F, Sarti A, Naylor PA, 2012, LOCALIZATION OF PLANAR ACOUSTIC REFLECTORS FROM THE COMBINATION OF LINEAR ESTIMATES, 20th European Signal Processing Conference (EUSIPCO), Publisher: IEEE COMPUTER SOC, Pages: 1019-1023, ISSN: 2076-1465

Conference paper

Annibale P, Antonacci F, Bestagini P, Brutti A, Canclini A, Cristoforetti L, Habets EAP, Filos J, Kellermann W, Kowalczyk K, Lombard A, Mabande E, Markovic D, Naylor PA, Omologo M, Rabenstein R, Sarti A, Svaizer P, Thomas MRP, 2011, The SCENIC Project: Space-Time Audio Processing for Environment-Aware Acoustic Sensing and Rendering

Conference paper

Slaney M, Naylor PA, 2011, Audio and Acoustic Signal Processing, IEEE SIGNAL PROCESSING MAGAZINE, Vol: 28, Pages: 160-U26, ISSN: 1053-5888

Journal article

Jarrett DP, Thomas MR, Habets EAP, Naylor PA, 2011, Simulating Room Impulse Responses for Spherical Microphone Arrays

Conference paper

Loganathan P, Habets EAP, Naylor PA, 2011, A Proportionate Adaptive Algorithm with Variable Partitioned Block Length for Acoustic Echo Cancellation

Conference paper

Gaubitch ND, Brookes M, Naylor PA, Sharma D, 2011, Bayesian Adaptive method for estimating Speech Intelligibility in noise, Pages: 169-174

We present the Bayesian Adaptive Speech Intelligibility Estimation (BASIE) method - a tool for rapid estimation of a given speech reception threshold (SRT) and the slope at that threshold of multiple psychometric functions for speech intelligibility in noise. The core of this tool is an adaptive Bayesian procedure, which adjusts the signal-to-noise ratio at each subsequent stimulus such that the expected variance of the threshold and slope estimates is minimised. Simulation results show that the algorithm is able to achieve SRT estimates accurate to within ±1 dB in under 30 iterations. Furthermore, we discuss strategies for using BASIE to evaluate the effects of speech processing algorithms on intelligibility and we give two illustrative examples for different noise reduction methods with supporting listening experiments.

Conference paper

Sharma D, Hilkhuysen G, Gaubitch ND, Brookes M, Naylor PA, 2011, C-Qual - A validation of PESQ using degradations encountered in forensic and law enforcement audio, Pages: 177-181

Assessment of speech quality of law-enforcement audio recordings is important as degradations introduced by non-ideal recording conditions can reduce the intelligence value of such recordings. Furthermore, a model that predicts speech quality could be beneficial for assessing the performance of audio collection and enhancement systems. The Perceptual Evaluation of Speech Quality (PESQ) algorithm (ITU-T P.862) has been validated for degradations common in telecommunications. In this paper we apply PESQ to degradations typically encountered in law enforcement. We also present a subjectively labeled database (C-Qual) containing distortions encountered in law enforcement scenarios. Comparing the prediction by PESQ and the observed opinions provided by the listeners shows that PESQ is less suitable for estimating the speech quality in this context.

Conference paper

Jarrett DP, Habets EAP, Thomas MRP, Gaubitch ND, Naylor PA, 2011, Dereverberation performance of rigid and open spherical microphone arrays: theory & simulation

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
