Imperial College London

Dr Alastair Moore

Faculty of Engineering, Department of Electrical and Electronic Engineering

Research Fellow in Acoustic Signal Processing
 
 
 

Contact

alastair.h.moore

Location

809, Electrical Engineering, South Kensington Campus


Publications


D'Olne E, Moore AH, Naylor PA, Donley J, Tourbabin V, Lunner T et al., 2024, Group Conversations in Noisy Environments (GiN) - Multimedia Recordings for Location-Aware Speech Enhancement, IEEE Open Journal of Signal Processing, Vol: 5, Pages: 374-382

Recent years have seen a growing interest in the use of smart glasses mounted with microphones to solve the cocktail party problem using beamforming techniques or machine learning. Many such approaches could bring substantial advances in hearing aid or Augmented Reality (AR) research. To validate these methods, the EasyCom [Donley et al., 2021] dataset introduced high-quality multi-modal recordings of conversations in noise, including egocentric multi-channel microphone array audio, speech source pose, and headset microphone audio. While providing comprehensive data, EasyCom lacks diversity in the acoustic environments considered and the degree of overlapping speech in conversations. This work therefore presents the Group in Noise (GiN) dataset of over 2 hours of group conversations in noisy environments recorded using binaural microphones and a pair of glasses mounted with 5 microphones. The recordings took place in 3 rooms and contain 6 seated participants as well as a standing facilitator. The data also include close-talking microphone audio and head-pose data for each speaker, an audio channel from a fixed reference microphone, and automatically annotated speaker activity information. A baseline method is used to demonstrate the use of the data for speech enhancement. The dataset is publicly available in d'Olne et al. [2023].

Journal article

Guiraud P, Moore AH, Vos RR, Naylor PA, Brookes M et al., 2023, Using a single-channel reference with the MBSTOI binaural intelligibility metric, Speech Communication, Vol: 149, Pages: 74-83, ISSN: 0167-6393

In order to assess the intelligibility of a target signal in a noisy environment, intrusive speech intelligibility metrics are typically used. They require a clean reference signal to be available, which can be difficult to obtain, especially for binaural metrics like the modified binaural short time objective intelligibility metric (MBSTOI). We here present a hybrid version of MBSTOI that incorporates a deep learning stage that allows the metric to be computed with only a single-channel clean reference signal. The models presented are trained on simulated data containing target speech, localised noise, diffuse noise, and reverberation. The hybrid output metrics are then compared directly to MBSTOI to assess performance. Results show how our single-channel-reference metric performs relative to MBSTOI. The outcome of this work offers a fast and flexible way to generate audio data for machine learning (ML) and highlights the potential for low-level implementation of ML into existing tools.

Journal article

Guiraud P, Moore AH, Vos RR, Naylor PA, Brookes M et al., 2023, The MBSTOI Binaural Intelligibility Metric Using a Close-Talking Microphone Reference, ISSN: 1520-6149

Intelligibility metrics are a fast way to determine how comprehensible a target signal is in a noisy situation. Most metrics however rely on having a clean reference signal for computation and are not adapted to live recordings. In this paper the deep correlation modified binaural short time objective intelligibility metric (Dcor-MBSTOI) is evaluated with a single-channel close-talking microphone signal as the reference. This reference signal inevitably contains some background noise and crosstalk from non-target sources. It is found that intelligibility is overestimated when using the close-talking microphone signal directly but that this overestimation can be eliminated by applying speech enhancement to the reference signal.

Conference paper

Hafezi S, Moore AH, Guiraud P, Naylor PA, Donley J, Tourbabin V, Lunner T et al., 2023, Subspace Hybrid Beamforming for Head-Worn Microphone Arrays, ISSN: 1520-6149

A two-stage multi-channel speech enhancement method is proposed which consists of a novel adaptive beamformer, the Hybrid Minimum Variance Distortionless Response (MVDR), an Isotropic-MVDR (Iso), and a novel multi-channel spectral Principal Components Analysis (PCA) denoising. In the first stage, the Hybrid-MVDR performs multiple MVDRs using a dictionary of pre-defined noise field models and picks the minimum-power outcome, which combines the robustness of signal-independent beamforming with the performance of adaptive beamforming. In the second stage, the outputs of Hybrid and Iso are jointly used in a two-channel PCA-based denoising to remove the 'musical noise' produced by the Hybrid beamformer. On a dataset of real 'cocktail-party' recordings with a head-worn array, the proposed method outperforms the baseline superdirective beamformer in noise suppression (fwSegSNR, SDR, SIR, SAR) and speech intelligibility (STOI) with similar speech quality (PESQ) improvement.

Conference paper
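The first stage described in the Subspace Hybrid Beamforming paper above selects, for each time-frequency bin, the minimum-power output among MVDRs designed from a dictionary of noise field models. A minimal NumPy sketch of that selection step, with hypothetical per-bin inputs (multichannel STFT vector `x`, steering vector `d`, and a list of candidate noise covariance models), could look like:

```python
import numpy as np

def mvdr_weights(R, d):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d) for one frequency bin."""
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d.conj() @ Rinv_d)

def hybrid_mvdr_bin(x, d, noise_models):
    """Run an MVDR for every pre-defined noise field model and keep the
    minimum-power output for this time-frequency bin (illustrative only)."""
    outputs = [mvdr_weights(R, d).conj() @ x for R in noise_models]
    powers = [abs(y) ** 2 for y in outputs]
    return outputs[int(np.argmin(powers))]
```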

Neo VW, D'Olne E, Moore AH, Naylor PA et al., 2022, Fixed beamformer design using polynomial eigenvalue decomposition, International Workshop on Acoustic Signal Enhancement (IWAENC), Publisher: IEEE, Pages: 1-5

Array processing is widely used in many speech applications involving multiple microphones. These applications include automatic speech recognition, robot audition, telecommunications, and hearing aids. A spatio-temporal filter for the array allows signals from different microphones to be combined desirably to improve the application performance. This paper will analyze and visually interpret the eigenvector beamformers designed by the polynomial eigenvalue decomposition (PEVD) algorithm, which are suited for arbitrary arrays. The proposed fixed PEVD beamformers are lightweight, with an average filter length of 114, and perform comparably to classical data-dependent minimum variance distortionless response (MVDR) and linearly constrained minimum variance (LCMV) beamformers for the separation of sources closely spaced by 5 degrees.

Conference paper

Moore AH, Green T, Brookes DM, Naylor PA et al., 2022, Measuring audio-visual speech intelligibility under dynamic listening conditions using virtual reality, AES 2022 International Audio for Virtual and Augmented Reality Conference, Publisher: Audio Engineering Society (AES), Pages: 1-8

The ELOSPHERES project is a collaboration between researchers at Imperial College London and University College London which aims to improve the efficacy of hearing aids. The benefit obtained from hearing aids varies significantly between listeners and listening environments. The noisy, reverberant environments which most people find challenging bear little resemblance to the clinics in which consultations occur. In order to make progress in speech enhancement, algorithms need to be evaluated under realistic listening conditions. A key aim of ELOSPHERES is to create a virtual reality-based test environment in which alternative speech enhancement algorithms can be evaluated using a listener-in-the-loop paradigm. In this paper we present the sap-elospheres-audiovisual-test (SEAT) platform and report the results of an initial experiment in which it was used to measure the benefit of visual cues in a speech intelligibility in spatial noise task.

Conference paper

Moore AH, Hafezi S, Vos RR, Naylor PA, Brookes M et al., 2022, A compact noise covariance matrix model for MVDR beamforming, IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol: 30, Pages: 2049-2061, ISSN: 2329-9290

Acoustic beamforming is routinely used to improve the SNR of the received signal in applications such as hearing aids, robot audition, augmented reality, teleconferencing, source localisation and source tracking. The beamformer can be made adaptive by using an estimate of the time-varying noise covariance matrix in the spectral domain to determine an optimised beam pattern in each frequency bin that is specific to the acoustic environment and that can respond to temporal changes in it. However, robust estimation of the noise covariance matrix remains a challenging task especially in non-stationary acoustic environments. This paper presents a compact model of the signal covariance matrix that is defined by a small number of parameters whose values can be reliably estimated. The model leads to a robust estimate of the noise covariance matrix which can, in turn, be used to construct a beamformer. The performance of beamformers designed using this approach is evaluated for a spherical microphone array under a range of conditions using both simulated and measured room impulse responses. The proposed approach demonstrates consistent gains in intelligibility and perceptual quality metrics compared to the static and adaptive beamformers used as baselines.

Journal article
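The abstract above describes a signal covariance model defined by a small number of parameters. As a loose illustration of the idea (not the paper's actual model), a two-parameter per-bin noise covariance built from an isotropic diffuse-field coherence plus uncorrelated sensor noise might be sketched as:

```python
import numpy as np

def diffuse_coherence(mic_pos, freq, c=343.0):
    """Spherically isotropic diffuse-field coherence: sinc(2 f d / c),
    where np.sinc(x) = sin(pi x) / (pi x) and d is the inter-mic distance."""
    d = np.linalg.norm(mic_pos[:, None, :] - mic_pos[None, :, :], axis=-1)
    return np.sinc(2.0 * freq * d / c)

def two_parameter_noise_model(mic_pos, freq, diffuse_power, sensor_power):
    """Noise covariance = diffuse_power * isotropic coherence + sensor_power * I.
    Hypothetical stand-in for the compact model described in the abstract."""
    return diffuse_power * diffuse_coherence(mic_pos, freq) \
        + sensor_power * np.eye(mic_pos.shape[0])
```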

Green T, Hilkhuysen G, Huckvale M, Rosen S, Brookes M, Moore A, Naylor P, Lightburn L, Xue W et al., 2022, Speech recognition with a hearing-aid processing scheme combining beamforming with mask-informed speech enhancement, Trends in Hearing, Vol: 26, Pages: 1-16, ISSN: 2331-2165

A signal processing approach combining beamforming with mask-informed speech enhancement was assessed by measuring sentence recognition in listeners with mild-to-moderate hearing impairment in adverse listening conditions that simulated the output of behind-the-ear hearing aids in a noisy classroom. Two types of beamforming were compared: binaural, with the two microphones of each aid treated as a single array, and bilateral, where independent left and right beamformers were derived. Binaural beamforming produces a narrower beam, maximising improvement in signal-to-noise ratio (SNR), but eliminates the spatial diversity that is preserved in bilateral beamforming. Each beamformer type was optimised for the true target position and implemented with and without additional speech enhancement in which spectral features extracted from the beamformer output were passed to a deep neural network trained to identify time-frequency regions dominated by target speech. Additional conditions comprising binaural beamforming combined with speech enhancement implemented using Wiener filtering or modulation-domain Kalman filtering were tested in normally-hearing (NH) listeners. Both beamformer types gave substantial improvements relative to no processing, with significantly greater benefit for binaural beamforming. Performance with additional mask-informed enhancement was poorer than with beamforming alone, for both beamformer types and both listener groups. In NH listeners the addition of mask-informed enhancement produced significantly poorer performance than both other forms of enhancement, neither of which differed from the beamformer alone. In summary, the additional improvement in SNR provided by binaural beamforming appeared to outweigh loss of spatial information, while speech understanding was not further improved by the mask-informed enhancement method implemented here.

Journal article

Guiraud P, Hafezi S, Naylor PA, Moore AH, Donley J, Tourbabin V, Lunner T et al., 2022, An introduction to the Speech Enhancement for Augmented Reality (SPEAR) challenge, 17th International Workshop on Acoustic Signal Enhancement (IWAENC), Publisher: IEEE, ISSN: 2639-4316

Conference paper

Guiraud P, Moore AH, Vos RR, Naylor PA, Brookes M et al., 2022, Machine learning for parameter estimation in the MBSTOI binaural intelligibility metric, 17th International Workshop on Acoustic Signal Enhancement (IWAENC), Publisher: IEEE, ISSN: 2639-4316

Conference paper

D'Olne E, Moore A, Naylor P, 2021, Model-based beamforming for wearable microphone arrays, European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 1105-1109

Beamforming techniques for hearing aid applications are often evaluated using behind-the-ear (BTE) devices. However, the growing number of wearable devices with microphones has made it possible to consider new geometries for microphone array beamforming. In this paper, we examine the effect of array location and geometry on the performance of binaural minimum power distortionless response (BMPDR) beamformers. In addition to the classical adaptive BMPDR, we evaluate the benefit of a recently-proposed method that estimates the sample covariance matrix using a compact model. Simulation results show that using a chest-mounted array reduces noise by an additional 1.3 dB compared to BTE hearing aids. The compact model method is found to yield higher predicted intelligibility than adaptive BMPDR beamforming, regardless of the array geometry.

Conference paper

Moore A, Vos R, Naylor P, Brookes D et al., 2021, Processing pipelines for efficient, physically-accurate simulation of microphone array signals in dynamic sound scenes, ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, ISSN: 0736-7791

Multichannel acoustic signal processing is predicated on the fact that the inter-channel relationships between the received signals can be exploited to infer information about the acoustic scene. Recently there has been increasing interest in algorithms which are applicable in dynamic scenes, where the source(s) and/or microphone array may be moving. Simulating such scenes has particular challenges which are exacerbated when real-time, listener-in-the-loop evaluation of algorithms is required. This paper considers candidate pipelines for simulating the array response to a set of point/image sources in terms of their accuracy, scalability and continuity. A new approach, in which the filter kernels are obtained using principal component analysis from time-aligned impulse responses, is proposed. When the number of filter kernels is ≤ 40, the new approach achieves more accurate simulation than competing methods.

Conference paper
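The pipeline proposed above derives filter kernels by principal component analysis of time-aligned impulse responses. A rough sketch of that decomposition step, assuming a hypothetical matrix `irs` of shape (num_responses, ir_length), is:

```python
import numpy as np

def pca_filter_kernels(irs, n_kernels=40):
    """Decompose time-aligned impulse responses into a mean response plus
    n_kernels principal-component filter kernels and per-response weights,
    so each IR is approximated as mean_ir + weights @ kernels."""
    mean_ir = irs.mean(axis=0)
    _, _, vt = np.linalg.svd(irs - mean_ir, full_matrices=False)
    kernels = vt[:n_kernels]                  # principal components (filters)
    weights = (irs - mean_ir) @ kernels.T     # per-IR mixing weights
    return mean_ir, kernels, weights
```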

Hogg A, Evers C, Moore A, Naylor P et al., 2021, Overlapping speaker segmentation using multiple hypothesis tracking of fundamental frequency, IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 29, Pages: 1479-1490, ISSN: 2329-9290

This paper demonstrates how the harmonic structure of voiced speech can be exploited to segment multiple overlapping speakers in a speaker diarization task. We explore how a change in the speaker can be inferred from a change in pitch. We show that voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. A novel system is proposed to track multiple harmonics simultaneously, allowing for the determination of onsets and end-points of a speaker’s utterance in the presence of an additional active speaker. This system is benchmarked against a segmentation system from the literature that employs a bidirectional long short-term memory network (BLSTM) approach and requires training. Experimental results highlight that the proposed approach outperforms the BLSTM baseline approach by 12.9% in terms of HIT rate for speaker segmentation. We also show that the estimated pitch tracks of our system can be used as features to the BLSTM to achieve further improvements of 1.21% in terms of coverage and 2.45% in terms of purity.

Journal article

Hafezi S, Moore A, Naylor P, 2021, Narrowband multi-source Direction-of-Arrival estimation in the spherical harmonic domain, Journal of the Acoustical Society of America, Vol: 149, ISSN: 0001-4966

A conventional approach to wideband multi-source (MS) direction-of-arrival (DOA) estimation is to perform single source (SS) DOA estimation in time-frequency (TF) bins for which a SS assumption is valid. Such methods use the W-disjoint orthogonality (WDO) assumption due to speech sparseness. As the number of sources increases, the chance of violating the WDO assumption increases. In challenging scenarios where multiple simultaneously active sources mask each other over a short period of time, a strongly masked source (due to intermittent activity or quietness) may only rarely be dominant in a TF bin. SS-based DOA estimators fail to detect or accurately localize masked sources in such scenarios. Two analytical approaches are proposed for narrowband DOA estimation based on the MS assumption in a bin in the spherical harmonic domain. In the first approach, eigenvalue decomposition is used to decompose a MS scenario into multiple SS scenarios, and a SS-based analytical DOA estimation is performed on each. The second approach analytically estimates two DOAs per bin assuming the presence of two active sources per bin. The evaluation shows a roughly twofold improvement in accuracy and improved robustness to sensor noise compared to the baseline methods.

Journal article

Xue W, Moore A, Brookes D, Naylor P et al., 2020, Speech enhancement based on modulation-domain parametric multichannel Kalman filtering, IEEE Transactions on Audio, Speech and Language Processing, Vol: 29, Pages: 393-405, ISSN: 1558-7916

Recently we presented a modulation-domain multichannel Kalman filtering (MKF) algorithm for speech enhancement, which jointly exploits the inter-frame modulation-domain temporal evolution of speech and the inter-channel spatial correlation to estimate the clean speech signal. The goal of speech enhancement is to suppress noise while keeping the speech undistorted, and a key problem is to achieve the best trade-off between speech distortion and noise reduction. In this paper, we extend the MKF by presenting a modulation-domain parametric MKF (PMKF) which includes a parameter that enables flexible control of the speech enhancement behaviour in each time-frequency (TF) bin. Based on the decomposition of the MKF cost function, a new cost function for PMKF is proposed, which uses the controlling parameter to weight the noise reduction and speech distortion terms. An optimal PMKF gain is derived using a minimum mean squared error (MMSE) criterion. We analyse the performance of the proposed MKF, and show its relationship to the speech distortion weighted multichannel Wiener filter (SDW-MWF). To evaluate the impact of the controlling parameter on speech enhancement performance, we further propose PMKF speech enhancement systems in which the controlling parameter is adaptively chosen in each TF bin. Experiments on a publicly available head-related impulse response (HRIR) database in different noisy and reverberant conditions demonstrate the effectiveness of the proposed method.

Journal article
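The PMKF above is analysed in relation to the speech distortion weighted multichannel Wiener filter (SDW-MWF), in which a single parameter trades noise reduction against speech distortion. A minimal per-bin sketch of that reference filter (the standard SDW-MWF, not the PMKF itself, with assumed speech and noise covariance inputs) is:

```python
import numpy as np

def sdw_mwf(Phi_s, Phi_n, mu=1.0, ref=0):
    """Speech-distortion-weighted multichannel Wiener filter for one TF bin:
    w = (Phi_s + mu * Phi_n)^{-1} Phi_s e_ref.  Larger mu trades extra speech
    distortion for more noise reduction; mu = 1 gives the standard MWF."""
    e = np.zeros(Phi_s.shape[0], dtype=Phi_s.dtype)
    e[ref] = 1.0
    return np.linalg.solve(Phi_s + mu * Phi_n, Phi_s @ e)
```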

Hafezi S, Moore AH, Naylor PA, 2019, Spatial consistency for multiple source direction-of-arrival estimation and source counting, Journal of the Acoustical Society of America, Vol: 146, Pages: 4592-4603, ISSN: 0001-4966

A conventional approach to wideband multi-source (MS) direction-of-arrival (DOA) estimation is to perform single source (SS) DOA estimation in time-frequency (TF) bins for which a SS assumption is valid. The typical SS-validity confidence metrics analyse the validity of the SS assumption over a fixed-size TF region local to the TF bin. The performance of such methods degrades as the number of simultaneously active sources increases due to the associated decrease in the size of the TF regions where the SS assumption is valid. A SS-validity confidence metric is proposed that exploits a dynamic MS assumption over relatively larger TF regions. The proposed metric first clusters the initial DOA estimates (one per TF bin) and then uses the members' spatial consistency as well as its cluster's spread to weight each TF bin. Distance-based and density-based clustering are employed as two alternative approaches for clustering DOAs. A noise-robust density-based clustering is also used in an evolutionary framework to propose a method for source counting and source direction estimation. The evaluation results based on simulations and also with real recordings show that the proposed weighting strategy significantly improves the accuracy of source counting and MS DOA estimation compared to the state-of-the-art.

Journal article
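As a rough sketch of the density-based clustering idea used above for source counting, here using scikit-learn's DBSCAN as a stand-in for the noise-robust variant described in the paper, with per-bin DOA estimates supplied as unit vectors:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def count_and_localise(doa_unit_vectors, eps=0.15, min_samples=20):
    """Cluster per-TF-bin DOA estimates; the number of clusters approximates
    the source count and the renormalised cluster means give rough directions.
    eps and min_samples are illustrative values, not the paper's settings."""
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(doa_unit_vectors)
    directions = []
    for k in sorted(set(labels) - {-1}):          # label -1 marks outlier bins
        v = doa_unit_vectors[labels == k].mean(axis=0)
        directions.append(v / np.linalg.norm(v))
    return len(directions), np.asarray(directions)
```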

Moore AH, de Haan JM, Pedersen MS, Brookes D, Naylor PA, Jensen J et al., 2019, Personalized signal-independent beamforming for binaural hearing aids, Journal of the Acoustical Society of America, Vol: 145, Pages: 2971-2981, ISSN: 0001-4966

The effect of personalized microphone array calibration on the performance of hearing aid beamformers under noisy reverberant conditions is studied. The study makes use of a new, publicly available, database containing acoustic transfer function measurements from 29 loudspeakers arranged on a sphere to a pair of behind-the-ear hearing aids in a listening room when worn by 27 males, 14 females, and 4 mannequins. Bilateral and binaural beamformers are designed using each participant's hearing aid head-related impulse responses (HAHRIRs). The performance of these personalized beamformers is compared to that of mismatched beamformers, where the HAHRIR used for the design does not belong to the individual for whom performance is measured. The case where the mismatched HAHRIR is that of a mannequin is of particular interest since it represents current practice in commercially available hearing aids. The benefit of personalized beamforming is assessed using an intrusive binaural speech intelligibility metric and in a matrix speech intelligibility test. For binaural beamforming, both measures demonstrate a statistically significant (p < 0.05) benefit of personalization. The benefit varies substantially between individuals with some predicted to benefit by as much as 1.5 dB.

Journal article

Moore A, Xue W, Naylor P, Brookes D et al., 2019, Noise covariance matrix estimation for rotating microphone arrays, IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 27, Pages: 519-530, ISSN: 2329-9290

The noise covariance matrix computed between the signals from a microphone array is used in the design of spatial filters and beamformers with applications in noise suppression and dereverberation. This paper specifically addresses the problem of estimating the covariance matrix associated with a noise field when the array is rotating during desired source activity, as is common in head-mounted arrays. We propose a parametric model that leads to an analytical expression for the microphone signal covariance as a function of the array orientation and array manifold. An algorithm for estimating the model parameters during noise-only segments is proposed and the performance shown to be improved, rather than degraded, by array rotation. The stored model parameters can then be used to update the covariance matrix to account for the effects of any array rotation that occurs when the desired source is active. The proposed method is evaluated in terms of the Frobenius norm of the error in the estimated covariance matrix and of the noise reduction performance of a minimum variance distortionless response beamformer. In simulation experiments the proposed method achieves 18 dB lower error in the estimated noise covariance matrix than a conventional recursive averaging approach and results in noise reduction which is within 0.05 dB of an oracle beamformer using the ground truth noise covariance matrix.

Journal article
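For context, the conventional recursive-averaging estimator used as the baseline in the comparison above (and which the paper improves upon under array rotation) can be sketched per frequency bin as follows; `x` is an assumed noise-only multichannel STFT frame:

```python
import numpy as np

def recursive_noise_cov(R_prev, x, alpha=0.95):
    """Conventional recursive averaging of the noise covariance in one bin
    from a noise-only multichannel STFT frame x.  This is the baseline the
    paper compares against, not the proposed rotation-aware model."""
    return alpha * R_prev + (1.0 - alpha) * np.outer(x, x.conj())
```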

Moore A, de Haan JM, Pedersen MS, Naylor P, Brookes D, Jensen J et al., 2019, Personalized HRTFs for hearing aids, ELOBES2019

Conference paper

Brookes D, Lightburn L, Moore A, Naylor P, Xue W et al., 2019, Mask-assisted speech enhancement for binaural hearing aids, ELOBES2019

Conference paper

Xue W, Moore AH, Brookes M, Naylor PA et al., 2018, Modulation-domain parametric multichannel Kalman filtering for speech enhancement, European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 2509-2513, ISSN: 2076-1465

The goal of speech enhancement is to reduce the noise signal while keeping the speech signal undistorted. Recently we developed the multichannel Kalman filtering (MKF) for speech enhancement, in which the temporal evolution of the speech signal and the spatial correlation between multichannel observations are jointly exploited to estimate the clean signal. In this paper, we extend the previous work to derive a parametric MKF (PMKF), which incorporates a controlling factor to achieve the trade-off between the speech distortion and noise reduction. The controlling factor weights between the speech distortion and noise reduction related terms in the cost function of PMKF, and based on the minimum mean squared error (MMSE) criterion, the optimal PMKF gain is derived. We analyse the performance of the proposed PMKF and show the differences with the speech distortion weighted multichannel Wiener filter (SDW-MWF). We conduct experiments in different noisy conditions to evaluate the impact of the controlling factor on the noise reduction performance, and the results demonstrate the effectiveness of the proposed method.

Conference paper

Moore AH, 2018, Multiple source direction of arrival estimation using subspace pseudointensity vectors, Publisher: arXiv

The recently proposed subspace pseudointensity method for direction of arrival estimation is applied in the context of Tasks 1 and 2 of the LOCATA Challenge using the Eigenmike recordings. Specific implementation details are described and results reported for the development dataset, for which the ground truth source directions are available. For both single and multiple source scenarios, the average absolute error angle is about 9 degrees.

Working paper
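The subspace pseudointensity method above refines the basic pseudointensity vector computed from zeroth- and first-order spherical harmonic (B-format) signals. A sketch of that underlying quantity, with the caveat that the sign convention mapping intensity to DOA varies between formulations, is:

```python
import numpy as np

def pseudointensity_doa(W, X, Y, Z):
    """Pseudointensity vectors Re{conj(W) * [X, Y, Z]} for arrays of per-bin
    first-order SH coefficients; returns unit DOA vectors, negated so that
    they point towards the source under a common convention."""
    I = np.real(np.conj(W)[..., None] * np.stack([X, Y, Z], axis=-1))
    norm = np.linalg.norm(I, axis=-1, keepdims=True) + 1e-12
    return -I / norm
```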

Moore AH, Lightburn L, Xue W, Naylor P, Brookes D et al., 2018, Binaural mask-informed speech enhancement for hearing aids with head tracking, International Workshop on Acoustic Signal Enhancement (IWAENC 2018), Publisher: IEEE, Pages: 461-465

An end-to-end speech enhancement system for hearing aids is proposed which seeks to improve the intelligibility of binaural speech in noise during head movement. The system uses a reference beamformer whose look direction is informed by knowledge of the head orientation and the a priori known direction of the desired source. From this a time-frequency mask is estimated using a deep neural network. The binaural signals are obtained using bilateral beamformers followed by a classical minimum mean square error speech enhancer, modified to use the estimated mask as a speech presence probability prior. In simulated experiments, the improvement in a binaural intelligibility metric (DBSTOI) given by the proposed system relative to beamforming alone corresponds to an SNR improvement of 4 to 6 dB. Results also demonstrate the individual contributions of incorporating the mask and the head orientation-aware beam steering to the proposed system.

Conference paper

Moore AH, Xue W, Naylor PA, Brookes M et al., 2018, Estimation of the Noise Covariance Matrix for Rotating Sensor Arrays, 52nd Asilomar Conference on Signals, Systems, and Computers, Publisher: IEEE, Pages: 1936-1941, ISSN: 1058-6393

Conference paper

Xue W, Moore A, Brookes DM, Naylor P et al., 2018, Modulation-domain multichannel Kalman filtering for speech enhancement, IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 26, Pages: 1833-1847, ISSN: 2329-9290

Compared with single-channel speech enhancement methods, multichannel methods can utilize spatial information to design optimal filters. Although some filters adaptively consider second-order signal statistics, the temporal evolution of the speech spectrum is usually neglected. By using linear prediction (LP) to model the inter-frame temporal evolution of speech, single-channel Kalman filtering (KF) based methods have been developed for speech enhancement. In this paper, we derive a multichannel KF (MKF) that jointly uses both interchannel spatial correlation and interframe temporal correlation for speech enhancement. We perform LP in the modulation domain, and by incorporating the spatial information, derive an optimal MKF gain in the short-time Fourier transform domain. We show that the proposed MKF reduces to the conventional multichannel Wiener filter if the LP information is discarded. Furthermore, we show that, under an appropriate assumption, the MKF is equivalent to a concatenation of the minimum variance distortion response beamformer and a single-channel modulation-domain KF and therefore present an alternative implementation of the MKF. Experiments conducted on a public head-related impulse response database demonstrate the effectiveness of the proposed method.

Journal article

Moore AH, Naylor P, Brookes DM, 2018, Room identification using frequency dependence of spectral decay statistics, IEEE Int. Conf. on Acoustics, Speech and Signal Processing, Publisher: Institute of Electrical and Electronics Engineers Inc., Pages: 6902-6906, ISSN: 0736-7791

A method for room identification is proposed based on the reverberation properties of multichannel speech recordings. The approach exploits the dependence of spectral decay statistics on the reverberation time of a room. The average negative-side variance within 1/3-octave bands is proposed as the identifying feature and shown to be effective in a classification experiment. However, negative-side variance is also dependent on the direct-to-reverberant energy ratio. The resulting sensitivity to different spatial configurations of source and microphones within a room is mitigated using a novel reverberation enhancement algorithm. A classification experiment using speech convolved with measured impulse responses and contaminated with environmental noise demonstrates the effectiveness of the proposed method, achieving 79% correct identification in the most demanding condition compared to 40% using unenhanced signals.

Conference paper

Xue W, Moore A, Brookes DM, Naylor P et al., 2018, Multichannel Kalman filtering for speech enhancement, IEEE Intl Conf on Acoustics, Speech and Signal Processing, Publisher: IEEE, ISSN: 2379-190X

The use of spatial information in multichannel speech enhancement methods is well established but information associated with the temporal evolution of speech is less commonly exploited. Speech signals can be modelled using an autoregressive process in the time-frequency modulation domain, and Kalman filtering based speech enhancement algorithms have been developed for single-channel processing. In this paper, a multichannel Kalman filter (MKF) for speech enhancement is derived that jointly considers the multichannel spatial information and the temporal correlations of speech. We model the temporal evolution of speech in the modulation domain and, by incorporating the spatial information, an optimal MKF gain is derived in the short-time Fourier transform domain. We also show that the proposed MKF becomes a conventional multichannel Wiener filter if the temporal information is discarded. Experiments using the signals generated from a public head-related impulse response database demonstrate the effectiveness of the proposed method in comparison to other techniques.

Conference paper

Yiallourides C, Moore AH, Auvinet E, Van der Straeten C, Naylor PA et al., 2018, Acoustic Analysis and Assessment of the Knee in Osteoarthritis During Walking, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 281-285

We examine the relation between the sounds emitted by the knee joint during walking and its condition, with particular focus on osteoarthritis, and investigate their potential for noninvasive detection of knee pathology. We present a comparative analysis of several features and evaluate their discriminant power for the task of normal-abnormal signal classification. We statistically evaluate the feature distributions using the two-sample Kolmogorov-Smirnov test and the Bhattacharyya distance. We propose the use of 11 statistics to describe the distributions and test with several classifiers. In our experiments with 249 normal and 297 abnormal acoustic signals from 40 knees, a Support Vector Machine with linear kernel gave the best results with an error rate of 13.9%.

Conference paper
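The evaluation methodology in the abstract above (per-feature two-sample Kolmogorov-Smirnov screening followed by a linear-kernel SVM) can be sketched with synthetic stand-in features, since the knee acoustic data themselves are not reproduced here:

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, (249, 11))      # synthetic stand-ins for the 11
abnormal = rng.normal(0.4, 1.2, (297, 11))    # distribution statistics per signal

# Screen each feature's discriminant power with a two-sample KS test
for j in range(normal.shape[1]):
    stat, p_value = ks_2samp(normal[:, j], abnormal[:, j])

# Linear-kernel SVM for normal/abnormal classification
X = np.vstack([normal, abnormal])
y = np.r_[np.zeros(len(normal)), np.ones(len(abnormal))]
accuracy = cross_val_score(SVC(kernel="linear"), X, y, cv=5).mean()
```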

Hafezi S, Moore AH, Naylor PA, 2018, Robust source counting and acoustic DOA estimation using density-based clustering, 10th IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), Publisher: IEEE, Pages: 395-399, ISSN: 1551-2282

Conference paper

D'Amore L, Arcucci R, Li Y, Montella R, Moore A, Phillipson L, Toumi R et al., 2018, Performance Assessment of the Incremental Strong Constraints 4DVAR Algorithm in ROMS, 12th International Conference on Parallel Processing and Applied Mathematics (PPAM), Publisher: Springer International Publishing AG, Pages: 48-57, ISSN: 0302-9743

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
