Publications

Conference paper

Eaton J, Gaubitch ND, Moore AH, Naylor PA et al., 2015,

The ACE Challenge - corpus description and performance evaluation

, WASPAA, Publisher: IEEE

Knowledge of the Direct-to-Reverberant Ratio (DRR) and Reverberation Time (T60) can be used to better perform speech and audio processing such as dereverberation. Established methods compute these parameters from measured Acoustic Impulse Responses (AIRs). However, in many practical situations the AIR is not available and the parameters must be estimated non-intrusively directly from noisy speech or audio signals. The Acoustic Characterization of Environments (ACE) Challenge is a competition to identify the most promising non-intrusive DRR and T60 estimation methods using real noisy reverberant speech. We describe the ACE corpus comprising multi-channel AIRs, and multi-channel noise including ambient, fan and babble noise recorded in the same environment as the measured AIRs, along with the corresponding DRR and T60 measurements. The evaluation methodology is discussed and comparative results are shown.
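As a rough illustration of the intrusive baseline mentioned in the abstract, the sketch below computes T60 and DRR from a measured AIR given as a plain list of samples. It is a minimal sketch, not the procedure used for the ACE corpus: the function names, the -5 to -25 dB fitting range for the decay curve, and the 2.5 ms half-window around the direct-path peak are all assumptions chosen for illustration.

```python
import math

def schroeder_edc_db(h):
    """Energy decay curve in dB, by backward (Schroeder) integration of h squared."""
    energy = [x * x for x in h]
    total = sum(energy)
    edc, remaining = [], total
    for e in energy:
        # Clamp to avoid log of zero or a tiny negative value from rounding.
        edc.append(10.0 * math.log10(max(remaining, 1e-300) / total))
        remaining -= e
    return edc

def t60_from_air(h, fs, start_db=-5.0, end_db=-25.0):
    """Fit a line to the decay curve between start_db and end_db (assumed
    range) and extrapolate the slope to a 60 dB decay."""
    edc = schroeder_edc_db(h)
    pts = [(n / fs, d) for n, d in enumerate(edc) if end_db <= d <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second
    return -60.0 / slope

def drr_db(h, fs, half_window_s=0.0025):
    """Ratio in dB of the energy in a short window around the largest peak
    (taken as the direct path) to the energy in all other samples."""
    peak = max(range(len(h)), key=lambda n: abs(h[n]))
    w = int(round(half_window_s * fs))
    lo, hi = max(0, peak - w), min(len(h), peak + w + 1)
    direct = sum(x * x for x in h[lo:hi])
    reverb = sum(x * x for x in h[:lo]) + sum(x * x for x in h[hi:])
    return 10.0 * math.log10(direct / reverb)
```

The non-intrusive methods evaluated in the challenge have to estimate the same two quantities without access to `h`, working only from the noisy reverberant signal.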

Conference paper

Eaton J, Naylor PA, 2015,

Reverberation time estimation on the ACE corpus using the SDD method

, arXiv, ACE Challenge Workshop, a satellite event of IEEE-WASPAA, Publisher: arXiv

Reverberation Time ($T_{60}$) is an important measure for characterizing the properties of a room. The author’s $T_{60}$ estimation algorithm was previously tested on simulated data where the noise is artificially added to the speech after convolution with impulse responses simulated using the image method. We test the algorithm on speech convolved with real recorded impulse responses and noise from the same rooms from the Acoustic Characterization of Environments (ACE) corpus and achieve results comparable to those using simulated data.

Conference paper

Doire C, Brookes D, Naylor P, Jensen SH et al., 2015,

Data-Driven Statistical Modelling of Room Impulse Responses in the Power Domain

, European Signal Processing Conference (EUSIPCO), Publisher: IEEE

Having an accurate statistical model of room impulse responses with a minimum number of parameters is of crucial importance in applications such as dereverberation. In this paper, by taking into account the behaviour of the early reflections, we extend the widely-used statistical model proposed by Polack. The squared room impulse response is modelled in each frequency band as the realisation of a stochastic process weighted by the sum of two exponential decays. Room-independent values for the new parameters involved are obtained through analysis of several room impulse response databases, and validation of the model in the likelihood sense is performed.
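The modelling idea in this abstract can be sketched as follows: a squared impulse response is drawn as a non-negative random sequence weighted by an envelope that is the sum of two exponential decays, one for the early part and one for the late tail. This is only a sketch under stated assumptions. The parameterisation by two amplitudes and two decay times, the use of squared Gaussian samples as the stochastic process, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random

def decay_rate(t60):
    """Power decay rate (1/s) that gives a 60 dB drop after t60 seconds."""
    return 6.0 * math.log(10.0) / t60

def two_decay_envelope(t, a_early, t60_early, a_late, t60_late):
    """Sum of two exponential power decays evaluated at time t (seconds)."""
    return (a_early * math.exp(-decay_rate(t60_early) * t)
            + a_late * math.exp(-decay_rate(t60_late) * t))

def simulate_squared_rir(fs, duration, a_early, t60_early, a_late, t60_late, seed=0):
    """One realisation of a squared RIR in a single band: squared Gaussian
    samples (an assumed choice of process) weighted by the envelope."""
    rng = random.Random(seed)
    out = []
    for n in range(int(duration * fs)):
        w = rng.gauss(0.0, 1.0) ** 2  # non-negative, unit mean
        out.append(w * two_decay_envelope(n / fs, a_early, t60_early,
                                          a_late, t60_late))
    return out
```

Setting `a_early` to zero leaves a single exponential decay, which corresponds to the form of the original Polack model that the paper extends.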

Conference paper

Moore AH, Evers C, Naylor PA, Alon DL, Rafaely B et al., 2015,

Direction of arrival estimation using pseudo-intensity vectors with direct-path dominance test

, European Signal Processing Conference, Publisher: IEEE, Pages: 2296-2300, ISSN: 2219-5491

The accuracy of direction of arrival estimation tends to degrade under reverberant conditions due to the presence of reflected signal components which are correlated with the direct path. The recently proposed direct-path dominance test provides a means of identifying time-frequency regions in which a single signal path is dominant. By analysing only these regions it was shown that the accuracy of the FS-MUSIC algorithm could be significantly improved. However, for real-time implementation a less computationally demanding localisation algorithm would be preferable. In the present contribution we investigate the direct-path dominance test as a preprocessing step to pseudo-intensity vector-based localisation. A novel formulation of the pseudo-intensity vector is proposed which further exploits the direct path dominance test and leads to improved localisation performance.

Conference paper

Hu M, Doclo S, Sharma D, Brookes D, Naylor P et al., 2015,

Noise Robust Blind System Identification Algorithms Based On A Rayleigh Quotient Cost Function

, European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 2476-2480

An important prerequisite for acoustic multi-channel equalization for speech dereverberation involves the identification of the acoustic channels between the source and the microphones. Blind System Identification (BSI) algorithms based on cross-relation error minimization are known to mis-converge in the presence of noise. Although algorithms have been proposed in the literature to improve robustness to noise, the estimated room impulse responses are usually constrained to have a flat magnitude spectrum. In this paper, noise robust algorithms based on a Rayleigh quotient cost function are proposed. Unlike the traditional algorithms, the estimated impulse responses are not always forced to have unit norm. Experimental results using simulated room impulse responses and several SNRs show that one of the proposed algorithms outperforms competing algorithms in terms of normalized projection misalignment.

Conference paper

Evers C, Moore AH, Naylor PA, Sheaffer J, Rafaely B et al., 2015,

Bearing-only acoustic tracking of moving speakers for robot audition

, 2015 IEEE International Conference on Digital Signal Processing (DSP), Publisher: IEEE, Pages: 1206-1210, ISSN: 1546-1874

This paper focuses on speaker tracking in robot audition for human-robot interaction. Using only acoustic signals, speaker tracking in enclosed spaces is subject to missing detections and spurious clutter measurements due to speech inactivity, reverberation and interference. Furthermore, many acoustic localization approaches estimate speaker direction, hence providing bearing-only measurements without range information. This paper presents a probability hypothesis density (PHD) tracker that augments the bearing-only speaker directions of arrival with a cloud of range hypotheses at speaker initiation and propagates the random variates through time. Furthermore, due to their formulation, PHD filters explicitly model, and hence provide robustness against, clutter and missing detections. The approach is verified using experimental results.

Conference paper

Moore AH, Evers C, Naylor PA, 2015,

Multichannel equalisation for high-order spherical microphone arrays using beamformed channels

, 2015 IEEE International Conference on Digital Signal Processing (DSP), Publisher: IEEE, Pages: 1211-1215, ISSN: 1546-1874

High-order spherical microphone arrays offer many practical benefits including relatively fine spatial resolution in all directions and rotation invariant processing using eigenbeams. Spatial filtering can reduce interference from noise and reverberation but in even moderately reverberant environments the beam pattern fails to suppress reverberation to a level adequate for typical applications. In this paper we investigate the feasibility of applying dereverberation by considering multiple beamformer outputs as channels to be dereverberated. In one realisation we process directly in the spherical harmonic domain where the beampatterns are mutually orthogonal. In a second realisation, which is not limited to spherical microphone arrays, beams are pointed in the direction of dominant reflections. Simulations demonstrate that in both cases reverberation is significantly reduced and, in the best case, clarity index is improved by 15 dB.

Journal article

Parada PP, Sharma D, Naylor PA, van Waterschoot T et al., 2015,

Reverberant speech recognition exploiting clarity index estimation

, Eurasip Journal on Advances in Signal Processing, Vol: 2016, Pages: 1-12, ISSN: 1687-6180

We present single-channel approaches to robust automatic speech recognition (ASR) in reverberant environments based on non-intrusive estimation of the clarity index (C50). Our best performing method includes the estimated value of C50 in the ASR feature vector and also uses C50 to select the most suitable ASR acoustic model according to the reverberation level. We evaluate our method on the REVERB Challenge database employing two different C50 estimators and show that our method outperforms the best baseline of the challenge achieved without unsupervised acoustic model adaptation, i.e. using multi-condition hidden Markov models (HMMs). Our approach achieves a 22.4% relative word error rate reduction in comparison to the best baseline of the challenge.
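For readers unfamiliar with the clarity index, the sketch below shows its standard intrusive definition: the ratio in dB of the impulse-response energy in the first 50 ms after the direct sound to the energy arriving later. The paper estimates this quantity non-intrusively from speech, which is not shown here. The function name and the choice of the largest-magnitude sample as the direct-sound onset are assumptions made for illustration.

```python
import math

def clarity_index_db(h, fs, split_s=0.05):
    """Early-to-late energy ratio in dB, with the split split_s seconds after
    the largest-magnitude sample (assumed to mark the direct sound)."""
    onset = max(range(len(h)), key=lambda n: abs(h[n]))
    split = onset + int(round(split_s * fs))
    early = sum(x * x for x in h[onset:split])
    late = sum(x * x for x in h[split:])
    return 10.0 * math.log10(early / late)
```

A higher value means more of the energy arrives early, which generally corresponds to less perceptible reverberation.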

Conference paper

Eaton J, Moore AH, Naylor PA, Skoglund J et al., 2015,

Direct-to-reverberant ratio estimation using a null-steered beamformer

, ICASSP, Publisher: IEEE, Pages: 46-50

Reverberation affects the quality and intelligibility of distant speech recorded in a room. Direct-to-Reverberant Ratio (DRR) is a useful measure for assessing the acoustic configuration and can be used to inform dereverberation algorithms. We describe a novel DRR estimation algorithm applicable where the signal was recorded with two or more microphones, such as mobile communications devices and laptops. The method uses a null-steered beamformer. In simulations the proposed method yields accurate DRR estimates to within +/- 4 dB across a wide variety of room sizes, reverberation times and source-receiver distances. It is also shown that the proposed method is more robust to background noise than a baseline approach. The best estimation accuracy is obtained in the region from -5 to 5 dB which is a relevant range for portable devices.

Journal article

Zahedi A, Ostergaard J, Jensen SH, Bech S, Naylor P et al., 2015,

Audio coding in wireless acoustic sensor networks

, SIGNAL PROCESSING, Vol: 107, Pages: 141-152, ISSN: 0165-1684

Citations: 11

Conference paper

Hu M, Sharma D, Doclo S, Brookes D, Naylor P et al., 2015,

Speaker change detection and speaker diarization using spatial information

, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference paper

Antonello N, van Waterschoot T, Moonen M, Naylor PA et al., 2015,

Evaluation of a numerical method for identifying surface acoustic impedances in a reverberant room

, Pages: 185-190

Wave-based room acoustic simulations are becoming more popular as the available compute power continues to increase. The definition of boundary conditions and acoustic impedances is of fundamental importance for these simulations to succeed in representing a realistic acoustical space. Acoustic impedance databases exist in terms of absorption coefficients, which are usually measured in reverberation chambers. In this type of measurement, the sound field is assumed to be diffuse, a condition which is not met in most rooms. In particular at low frequencies, where wave-based simulations are possible, a different approach is sought as an alternative to acoustic impedance measurements. This paper focuses on a recently proposed method for estimating surface acoustic impedances. This method is based on the use of a numerical room model, and does not require the assumption of a diffuse field. Assuming that the geometry of the room is known, a finite difference time domain (FDTD) simulation is matched with measured data by solving an optimization problem. The set-up for such a measurement method consists only of a set of microphones and a loudspeaker. This could be applied in every room, removing the need for expensive facilities such as reverberation chambers. The solution of the optimization problem leads to the sought parameters of the acoustic surface impedances. In this paper the adjoint method is used for the computation of the derivative in the optimization problem. This method enables a large number of decision variables in the optimization problem, making it possible to account for inhomogeneities of the surface acoustic impedance and hence to avoid the need to specify the different acoustic impedance surfaces beforehand.

Citations: 2

Conference paper

Doire CSJ, Brookes M, Naylor PA, Betts D, Hicks CM, Dmour MA, Jensen SH et al., 2015,

SINGLE-CHANNEL BLIND ESTIMATION OF REVERBERATION PARAMETERS

, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 31-35, ISSN: 1520-6149

Citations: 2

Conference paper

Eaton J, Moore AH, Naylor PA, Skoglund J et al., 2015,

DIRECT-TO-REVERBERANT RATIO ESTIMATION USING A NULL-STEERED BEAMFORMER

, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 46-50, ISSN: 1520-6149

Citations: 13

Conference paper

Hu M, Sharma D, Doclo S, Brookes M, Naylor PA et al., 2015,

SPEAKER CHANGE DETECTION AND SPEAKER DIARIZATION USING SPATIAL INFORMATION

, 40th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 5743-5747, ISSN: 1520-6149

Citations: 6

Conference paper

Zahedi A, Ostergaard J, Jensen SH, Naylor P, Bech S et al., 2015,

Coding and Enhancement in Wireless Acoustic Sensor Networks

, Data Compression Conference (DCC), Publisher: IEEE, Pages: 293-302, ISSN: 1068-0314

Citations: 4

Conference paper

Hafezi S, Moore AH, Naylor PA, 2015,

MODELING SOURCE DIRECTIVITY IN ROOM IMPULSE RESPONSE SIMULATION FOR SPHERICAL MICROPHONE ARRAYS

, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 574-578, ISSN: 2076-1465

Citations: 1

Conference paper

Nelke CM, Naylor PA, Vary P, 2015,

CORPUS BASED RECONSTRUCTION OF SPEECH DEGRADED BY WIND NOISE

, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 864-868, ISSN: 2076-1465

Citations: 1

Conference paper

Javed HA, Naylor PA, 2015,

AN EXTENDED REVERBERATION DECAY TAIL METRIC AS A MEASURE OF PERCEIVED LATE REVERBERATION

, 23rd European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1063-1067, ISSN: 2076-1465
