Search results

  • Book chapter
    Jarrett DP, Habets EAP, Naylor PA, 2017,

    Introduction

    , Springer Topics in Signal Processing, Pages: 1-10

    The motivation behind this book lies in the rapidly growing interest in spherical microphone arrays over the last decade. Important applications for these arrays include human-human and human-machine speech communication systems and spatial sound recording. While human-human speech communication systems have a long history, speech also plays an ever-growing part in human-machine communication. This trend has been fuelled by advances in speech recognition technology, as well as the explosion in available computing power, particularly on mobile devices. With the widespread availability of 3D sound cinema systems and virtual reality gear with 3D binaural sound reproduction, the need to capture spatial sound is rapidly growing. Spherical microphone arrays are particularly suitable for capturing all three dimensions of the sound field, including both ambient sounds and sounds from particular directions. In this chapter, we introduce the topic of acoustic signal processing using microphone arrays, and then explore spherical microphone arrays in more detail. We provide an outline of the structure of the book, and discuss the relationships between each of the subsequent chapters.

  • Book chapter
    Jarrett DP, Habets EAP, Naylor PA, 2017,

    Spherical array acoustic impulse response simulation

    , Springer Topics in Signal Processing, Pages: 39-64

    In order to evaluate spherical array processing algorithms comprehensively under many different acoustic conditions, it is indispensable to use simulated acoustic impulse responses (AIRs) to characterize the source–microphone acoustic channel, most typically in a room or other enclosed acoustic environment. The image method proposed by Allen and Berkley is a well-established way of doing this for point-to-point AIRs with sensors in free space. However, it does not account for the acoustic scattering introduced by a rigid sphere. In this chapter, we present a method for simulating the AIRs between a sound source and microphones positioned on a rigid spherical array. In addition, three examples are presented based on this method: an analysis of a diffuse reverberant sound field, a study of binaural cues in the presence of reverberation, and an illustration of the algorithm’s use as a mouth simulator.
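
    As a point of reference for the simulator described above, the sketch below implements a bare-bones free-field image method in the spirit of Allen and Berkley: image sources are mirrored across the walls of a shoebox room and their attenuated, delayed contributions are summed into an AIR. The function name, the scalar reflection coefficient, and the omission of fractional-delay interpolation and rigid-sphere scattering are illustrative simplifications, not the book's simulator.

```python
import numpy as np

def image_method_air(src, mic, room, beta, fs=16000, c=343.0, max_order=6, length=4096):
    """Bare-bones free-field image-method AIR for a shoebox room.

    src, mic : (3,) source and microphone positions in metres
    room     : (3,) room dimensions (Lx, Ly, Lz) in metres
    beta     : reflection coefficient applied per wall reflection (scalar here)
    """
    src, mic, room = (np.asarray(v, dtype=float) for v in (src, mic, room))
    h = np.zeros(length)
    orders = range(-max_order, max_order + 1)
    for nx in orders:
        for ny in orders:
            for nz in orders:
                img = np.empty(3)
                n_refl = 0
                for d, n in enumerate((nx, ny, nz)):
                    # even n: translated copy of the source; odd n: mirrored copy
                    img[d] = n * room[d] + src[d] if n % 2 == 0 else (n + 1) * room[d] - src[d]
                    n_refl += abs(n)
                dist = np.linalg.norm(img - mic)
                sample = int(round(dist / c * fs))
                if sample < length:
                    # spherical spreading loss and one 'beta' per wall reflection
                    h[sample] += beta ** n_refl / (4.0 * np.pi * max(dist, 1e-3))
    return h

# Example: 5 m x 4 m x 3 m room with moderately reflective walls
h = image_method_air(src=[1.0, 2.0, 1.5], mic=[3.5, 2.5, 1.5],
                     room=[5.0, 4.0, 3.0], beta=0.8)
```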

  • Book chapter
    Jarrett DP, Habets EAP, Naylor PA, 2017,

    Signal-dependent array processing

    , Springer Topics in Signal Processing, Pages: 113-139

    In this chapter, we derive spherical harmonic domain signal-dependent beamformers, whose weights depend on the second-order statistics of the desired signal and/or of the noise to be suppressed. These beamformers adaptively seek to achieve optimal performance in terms of noise reduction and speech distortion.
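
    For concreteness, a minimal narrowband example of such a signal-dependent design is the MVDR beamformer, whose weights depend on the noise covariance matrix. The sketch below is a generic space-domain illustration under that assumption; the chapter's derivation is in the spherical harmonic domain and covers further variants.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency bin.

    noise_cov : (M, M) noise covariance matrix (second-order noise statistics)
    steering  : (M,)   steering vector towards the desired source
    """
    d = steering.reshape(-1, 1)
    Rinv_d = np.linalg.solve(noise_cov, d)
    return (Rinv_d / (d.conj().T @ Rinv_d)).ravel()

# Toy example: 4 sensors, spatially white noise
M = 4
d = np.exp(1j * np.pi * 0.3 * np.arange(M))   # hypothetical steering vector
w = mvdr_weights(np.eye(M), d)
print(abs(w.conj() @ d))                      # distortionless constraint: gain 1 towards the source
```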

  • Book chapter
    Jarrett DP, Habets EAP, Naylor PA, 2017,

    Signal-independent array processing

    , Springer Topics in Signal Processing, Pages: 93-112

    The process of combining signals acquired by a microphone array in order to ‘focus’ on a signal in a specific direction is known as beamforming or spatial filtering. This chapter considers signal-independent (fixed) beamformers, controlled by weights only dependent on the direction of arrival of the source to be extracted, and which do not otherwise depend on the desired signal. Because the weights of these beamformers are given by simple expressions, they present the advantages of being straightforward to implement and of having low computational complexity.
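
    A minimal example of such a fixed design is the frequency-domain delay-and-sum beamformer sketched below: its weights are computed purely from the array geometry and the look direction, so they can be precomputed once per direction. This is a generic space-domain illustration, not the spherical harmonic domain beamformers derived in the chapter.

```python
import numpy as np

def delay_and_sum_weights(mic_pos, look_dir, freq, c=343.0):
    """Frequency-domain delay-and-sum weights for a far-field plane wave.

    mic_pos  : (M, 3) microphone positions in metres
    look_dir : (3,)   unit vector pointing towards the desired source
    freq     : frequency in Hz
    The weights phase-align the microphone signals for the look direction and
    average them; they depend only on geometry and direction of arrival.
    """
    u = np.asarray(look_dir, dtype=float)
    u /= np.linalg.norm(u)
    delays = np.asarray(mic_pos) @ u / c          # relative propagation delays in seconds
    return np.exp(2j * np.pi * freq * delays) / len(mic_pos)

# Toy example: 8-element linear array with 5 cm spacing, looking broadside
mics = np.column_stack((0.05 * np.arange(8), np.zeros(8), np.zeros(8)))
w = delay_and_sum_weights(mics, look_dir=[0.0, 1.0, 0.0], freq=1000.0)
```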

  • Book chapter
    Jarrett DP, Habets EAP, Naylor PA, 2017,

    Parametric array processing

    , Springer Topics in Signal Processing, Pages: 141-150

    This chapter takes a different approach to signal enhancement using spherical microphone arrays: a physically-motivated parametric representation of the sound field is introduced. It is shown that the sound field can be manipulated to achieve noise reduction or dereverberation by applying a time- and frequency-dependent gain to a reference signal. The gain is a simple function of the sound field parameters, which can be estimated using the methods presented in Chap. 5.
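
    As a rough illustration of applying a parameter-driven time- and frequency-dependent gain, the sketch below uses a Wiener-like gain computed from a direct-to-diffuse ratio estimate; the specific gain functions and parameter estimators discussed in the chapter may differ.

```python
import numpy as np

def parametric_gain(ddr):
    """Wiener-like time-frequency gain from a direct-to-diffuse ratio estimate.

    ddr : per time-frequency bin estimate of the direct-to-diffuse ratio (linear scale)
    Bins dominated by the direct sound (high DDR) are passed; diffuse
    (reverberant or noisy) bins are attenuated.
    """
    return ddr / (ddr + 1.0)

# Toy example: gains for a strongly direct, a balanced and a diffuse bin
print(parametric_gain(np.array([10.0, 1.0, 0.1])))   # ~[0.91, 0.50, 0.09]
```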

  • Journal article
    Moore AH, Peso P, Naylor PA, 2016,

    Speech enhancement for robust automatic speech recognition: Evaluation using a baseline system and instrumental measures

    , Computer Speech and Language, Vol: 46, Pages: 574-584, ISSN: 1095-8363

    Automatic speech recognition in everyday environments must be robust to significant levels of reverberation and noise. One strategy to achieve such robustness is multi-microphone speech enhancement. In this study, we present results of an evaluation of different speech enhancement pipelines using a state-of-the-art ASR system for a wide range of reverberation and noise conditions. The evaluation exploits the recently released ACE Challenge database which includes measured multichannel acoustic impulse responses from 7 different rooms with reverberation times ranging from 0.33 s to 1.34 s. The reverberant speech is mixed with ambient, fan and babble noise recordings made with the same microphone setups in each of the rooms. In the first experiment, performance of the ASR without speech processing is evaluated. Results clearly indicate the deleterious effect of both noise and reverberation. In the second experiment, different speech enhancement pipelines are evaluated with relative word error rate reductions of up to 82%. Finally, the ability of selected instrumental metrics to predict ASR performance improvement is assessed. The best performing metric, Short-Time Objective Intelligibility Measure, is shown to have a Pearson correlation coefficient of 0.79, suggesting that it is a useful predictor of algorithm performance in these tests.

  • Conference paper
    Moore AH, Evers C, Naylor PA, 2016,

    2D direction of arrival estimation of multiple moving sources using a spherical microphone array

    , European Signal Processing Conference, Publisher: IEEE, ISSN: 2219-5491

    Direction of arrival estimation using a spherical microphone array is an important and growing research area. One promising algorithm is the recently proposed Subspace Pseudo-Intensity Vector method. In this contribution the Subspace Pseudo-Intensity Vector method is combined with a state-of-the-art method for robustly estimating the centres of mass in a 2D histogram based on matching pursuits. The performance of the improved Subspace Pseudo-Intensity Vector method is evaluated in the context of localising multiple moving sources, where it is shown to outperform competing methods in terms of clutter rate and the number of missed detections whilst remaining comparable in terms of localisation accuracy.

  • Conference paper
    Evers C, Moore A, Naylor P, 2016,

    Localization of Moving Microphone Arrays from Moving Sound Sources for Robot Audition

    , European Signal Processing Conference (EUSIPCO), Publisher: IEEE, ISSN: 2076-1465

    Acoustic Simultaneous Localization and Mapping (a-SLAM) jointly localizes the trajectory of a microphone array installed on a moving platform, whilst estimating the acoustic map of surrounding sound sources, such as human speakers. Whilst traditional approaches for SLAM in the vision and optical research literature rely on the assumption that the surrounding map features are static, in the acoustic case the positions of talkers are usually time-varying due to head rotations and body movements. This paper demonstrates that tracking of moving sources can be incorporated in a-SLAM by modelling the acoustic map as a Random Finite Set (RFS) of multiple sources and explicitly imposing models of the source dynamics. The proposed approach is verified and its performance evaluated for realistic simulated data.

  • Conference paper
    Hafezi S, Moore AH, Naylor PA, 2016,

    Multiple source localization in the spherical harmonic domain using augmented intensity vectors based on grid search

    , European Signal Processing Conference, Publisher: IEEE, ISSN: 2219-5491

    Multiple source localization is an important task in acoustic signal processing with applications including dereverberation, source separation, source tracking and environment mapping. When using spherical microphone arrays, it has been previously shown that Pseudo-Intensity Vectors (PIV) and Augmented Intensity Vectors (AIV) are an effective approach for direction of arrival estimation of a sound source. In this paper, we evaluate AIV-based localization in acoustic scenarios involving multiple sound sources. Simulations are conducted where the number of sources, their angular separation and the reverberation time of the room are varied. The results indicate that AIV outperforms PIV and Steered Response Power (SRP) with an average accuracy between 5 and 10 degrees for sources with angular separation of 30 degrees or more. AIV also shows better robustness to reverberation time than PIV and SRP.

  • Conference paper
    Dorfan Y, Evers C, Gannot S, Naylor P et al., 2016,

    Speaker Localization with Moving Microphone Arrays

    , European Signal Processing Conference (EUSIPCO), Publisher: IEEE, ISSN: 2076-1465

    Speaker localization algorithms often assume a static location for all sensors. This assumption simplifies the models used, since all acoustic transfer functions are linear time invariant. In many applications this assumption is not valid. In this paper we address the localization challenge with moving microphone arrays. We propose two algorithms to find the speaker position. The first approach is a batch algorithm based on the maximum likelihood criterion, optimized via expectation-maximization iterations. The second approach is a particle filter for sequential Bayesian estimation. The performance of both approaches is evaluated and compared for simulated reverberant audio data from a microphone array with two sensors.

  • Conference paper
    Xue W, Brookes DM, Naylor PA, 2016,

    Under-modelled blind system identification for time delay estimation in reverberant environments

    , 15th International Workshop on Acoustic Signal Enhancement (IWAENC), Publisher: IEEE

    In multichannel systems, acoustic time delay estimation (TDE) is a challenging problem in reverberant environments. Although blind system identification (BSI) based methods have been proposed which utilize a realistic signal model for the room impulse response (RIR), their TDE performance depends strongly on that of the BSI, which is often inaccurate in practice when the identified responses are under-modelled. In this paper, we propose a new under-modelled BSI based method for TDE in reverberant environments. An under-modelled BSI algorithm is derived, which is based on maximizing the cross-correlation of the cross-filtered signals rather than minimizing the cross-relation error, and also exploits the sparsity of the early part of the RIR. For TDE, this new criterion can be viewed as a generalization of conventional cross-correlation-based TDE methods by considering a more realistic model for the early RIR. Depending on the microphone spacing, only a short early part of each RIR is identified, and the time delays are estimated based on the peak locations in the identified early RIRs. Experiments in different reverberant environments with speech source signals demonstrate the effectiveness of the proposed method.
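
    For context, the sketch below shows the conventional cross-correlation baseline (GCC-PHAT) that the proposed criterion generalizes; the under-modelled BSI algorithm itself, which identifies the early RIRs with a sparsity penalty, is not reproduced here.

```python
import numpy as np

def gcc_phat_delay(x1, x2, fs, max_delay_s=0.001):
    """Estimate the delay of x2 relative to x1 (in seconds) with GCC-PHAT."""
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X2 * np.conj(X1)
    cross /= np.maximum(np.abs(cross), 1e-12)                    # PHAT weighting
    cc = np.fft.irfft(cross, n)
    max_shift = int(max_delay_s * fs)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))   # lags -max_shift..max_shift
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# Toy example: x2 lags x1 by 5 samples
fs = 16000
x1 = np.random.randn(4096)
x2 = np.concatenate((np.zeros(5), x1[:-5]))
print(round(gcc_phat_delay(x1, x2, fs) * fs))                    # -> 5
```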

  • Journal article
    Moore AH, Evers C, Naylor PA, 2016,

    Direction of Arrival Estimation in the Spherical Harmonic Domain using Subspace Pseudo-Intensity Vectors

    , IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol: 25, Pages: 178-192, ISSN: 2329-9290

    Direction of Arrival (DOA) estimation is a fundamental problem in acoustic signal processing. It is used in a diverse range of applications, including spatial filtering, speech dereverberation, source separation and diarization. Intensity vector-based DOA estimation is attractive, especially for spherical sensor arrays, because it is computationally efficient. Two such methods are presented which operate on a spherical harmonic decomposition of a sound field observed using a spherical microphone array. The first uses Pseudo-Intensity Vectors (PIVs) and works well in acoustic environments where only one sound source is active at any time. The second uses Subspace Pseudo-Intensity Vectors (SSPIVs) and is targeted at environments where multiple simultaneous sources and significant levels of reverberation make the problem more challenging. Analytical models are used to quantify the effects of an interfering source, diffuse noise and sensor noise on PIVs and SSPIVs. The accuracy of DOA estimation using PIVs and SSPIVs is compared against the state-of-the-art in simulations including realistic reverberation and noise for single and multiple, stationary and moving sources. Finally, robust performance of the proposed methods is demonstrated using speech recordings in real acoustic environments.
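
    A minimal sketch of the PIV idea (not the full PIV/SSPIV algorithms from the paper) follows: the zeroth-order eigenbeam is combined with the three first-order eigenbeams to form a vector that points towards the dominant source in each time-frequency bin, and the averaged vector is converted to a DOA. The eigenbeam ordering and normalisation assumed here follow one common real spherical harmonic convention and may differ from the paper's.

```python
import numpy as np

def pseudo_intensity_vectors(p00, p1m1, p10, p11):
    """Unit-norm pseudo-intensity vectors from zeroth- and first-order eigenbeams.

    p00            : zeroth-order (omnidirectional) eigenbeam per STFT bin, complex
    p1m1, p10, p11 : first-order eigenbeams, assumed to act as y-, z- and x-axis dipoles
    Normalisation constants are omitted; only the direction matters here.
    """
    dipoles = np.stack([p11, p1m1, p10])                   # (x, y, z) components
    piv = np.real(np.conj(p00)[None, :] * dipoles)
    return piv / np.maximum(np.linalg.norm(piv, axis=0, keepdims=True), 1e-12)

def doa_from_pivs(piv):
    """Average the unit PIVs and return (azimuth, inclination) in radians."""
    mean = piv.mean(axis=1)
    mean /= np.linalg.norm(mean)
    return np.arctan2(mean[1], mean[0]), np.arccos(mean[2])

# Toy example: noise-free plane wave from azimuth 45 deg in the horizontal plane
s = np.random.randn(200) + 1j * np.random.randn(200)
az, inc = np.deg2rad(45.0), np.deg2rad(90.0)
ux, uy, uz = np.sin(inc) * np.cos(az), np.sin(inc) * np.sin(az), np.cos(inc)
piv = pseudo_intensity_vectors(s, uy * s, uz * s, ux * s)
print(np.rad2deg(doa_from_pivs(piv)))                      # approximately [45. 90.]
```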

  • Conference paper
    Moore AH, Naylor P, 2016,

    Linear prediction based dereverberation for spherical microphone arrays

    , 15th International Workshop on Acoustic Signal Enhancement (IWAENC), Publisher: IEEE

    Dereverberation is an important preprocessing step in many speech systems, both for human and machine listening. In many situations, including robot audition, the sound sources of interest can be incident from any direction. In such circumstances, a spherical microphone array allows direction of arrival estimation which is free of spatial aliasing, and direction-independent beam patterns can be formed. This contribution formulates the Weighted Prediction Error algorithm in the spherical harmonic domain and compares the performance to a space domain implementation. Simulation results demonstrate that performing dereverberation in the spherical harmonic domain allows many more microphones to be used without increasing the computational cost. The benefit of using many microphones is particularly apparent at low signal-to-noise ratios, where for the conditions tested up to 71% improvement in speech-to-reverberation modulation ratio was achieved.
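
    A much-simplified, single-channel sketch of the Weighted Prediction Error idea is given below: in each STFT frequency bin, late reverberation is predicted linearly from delayed past frames and subtracted, with the prediction filter re-estimated using the current estimate of the desired-signal variance. The paper's formulation operates on all channels or eigenbeams of the spherical array jointly; this scalar version only illustrates the core recursion.

```python
import numpy as np

def wpe_single_bin(y, delay=3, taps=10, iterations=3, eps=1e-8):
    """Single-channel WPE for one STFT frequency bin.

    y : (T,) complex STFT coefficients of one frequency bin over time
    Late reverberation is predicted from frames at least `delay` frames in the
    past using `taps` filter coefficients, and subtracted from the observation.
    """
    T = len(y)
    d = y.copy()
    # Delayed observation matrix: row t holds [y[t-delay], ..., y[t-delay-taps+1]]
    Y = np.zeros((T, taps), dtype=complex)
    for k in range(taps):
        shift = delay + k
        Y[shift:, k] = y[:T - shift]
    for _ in range(iterations):
        lam = np.maximum(np.abs(d) ** 2, eps)    # desired-signal variance estimate
        R = (Y.conj().T / lam) @ Y               # variance-weighted covariance
        r = (Y.conj().T / lam) @ y               # variance-weighted correlation
        g = np.linalg.solve(R + eps * np.eye(taps), r)
        d = y - Y @ g                            # remove the predicted late reverberation
    return d
```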

  • Journal article
    Naylor PA, Zahedi A, Jensen S, Bech S et al., 2016,

    Source Coding in Networks with Covariance Distortion Constraints

    , IEEE Transactions on Signal Processing, Vol: 64, Pages: 5943-5958, ISSN: 1053-587X

    We consider a source coding problem with a network scenario in mind, and formulate it as a remote vector Gaussian Wyner-Ziv problem under covariance matrix distortions. We define a notion of minimum for two positive-definite matrices based on which we derive an explicit formula for the rate-distortion function (RDF). We then study the special cases and applications of this result. We show that two well-studied source coding problems, i.e. remote vector Gaussian Wyner-Ziv problems with mean-squared error and mutual information constraints, are in fact special cases of our results. Finally, we apply our results to a joint source coding and denoising problem. We consider a network with a centralized topology and a given weighted sum-rate constraint, where the received signals at the center are to be fused to maximize the output SNR while enforcing no linear distortion. We show that one can design the distortion matrices at the nodes in order to maximize the output SNR at the fusion center. We thereby bridge between denoising and source coding within this setup.

  • Conference paper
    Xue W, Brookes M, Naylor PA, 2016,

    Cross-Correlation Based Under-Modelled Multichannel Blind Acoustic System Identification with Sparsity Regularization

    , 24th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 718-722, ISSN: 2076-1465

  • Book
    Jarrett DP, Habets EAP, Naylor PA, 2016,

    Theory and Applications of Spherical Microphone Array Processing

    , Publisher: Springer, ISBN: 9783319422114

    This book presents the signal processing algorithms that have been developed to process the signals acquired by a spherical microphone array.

  • Journal article
    Eaton DJ, Gaubitch ND, Moore AH, Naylor PA et al., 2016,

    Estimation of room acoustic parameters: the ACE challenge

    , IEEE Transactions on Audio Speech and Language Processing, Vol: 24, Pages: 1681-1693, ISSN: 2329-9290

    Reverberation Time (T60) and Direct-to-Reverberant Ratio (DRR) are important parameters which together can characterize sound captured by microphones in non-anechoic rooms. These parameters are important in speech processing applications such as speech recognition and dereverberation. The values of T60 and DRR can be estimated directly from the Acoustic Impulse Response (AIR) of the room. In practice, the AIR is not normally available, in which case these parameters must be estimated blindly from the observed speech in the microphone signal. The Acoustic Characterization of Environments (ACE) Challenge aimed to determine the state-of-the-art in blind acoustic parameter estimation and also to stimulate research in this area. A summary of the ACE Challenge and the corpus used in the challenge is presented together with an analysis of the results. Existing algorithms were submitted alongside novel contributions, the comparative results for which are presented in this paper. The challenge showed that T60 estimation is a mature field where analytical approaches dominate, whilst DRR estimation is a less mature field where machine learning approaches are currently more successful.
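
    To make the two parameters concrete, the sketch below computes T60 (via Schroeder backward integration of the energy decay curve) and DRR directly from a measured AIR; the ACE Challenge itself addresses the much harder problem of estimating them blindly from reverberant, noisy speech, which is not shown here. The direct-path window length is an illustrative choice.

```python
import numpy as np

def t60_from_air(h, fs, fit_range_db=(-5.0, -25.0)):
    """T60 from an AIR: Schroeder backward integration plus a line fit.

    The decay rate is fitted between -5 and -25 dB of the energy decay curve
    (a T20-style fit) and extrapolated to 60 dB of decay.
    """
    edc = np.cumsum(h[::-1] ** 2)[::-1]                    # energy decay curve
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-12)
    hi, lo = fit_range_db
    idx = np.where((edc_db <= hi) & (edc_db >= lo))[0]
    slope, _ = np.polyfit(idx / fs, edc_db[idx], 1)        # dB per second (negative)
    return -60.0 / slope

def drr_from_air(h, fs, direct_window_ms=2.5):
    """DRR in dB: energy up to shortly after the direct-path peak vs. the remainder."""
    n_direct_end = int(np.argmax(np.abs(h))) + int(direct_window_ms * 1e-3 * fs)
    direct = np.sum(h[:n_direct_end] ** 2)
    reverberant = np.sum(h[n_direct_end:] ** 2) + 1e-12
    return 10.0 * np.log10(direct / reverberant)

# Toy example: unit direct-path impulse followed by an exponentially decaying tail
fs = 16000
t = np.arange(fs) / fs
h = np.concatenate(([1.0], 0.1 * np.random.randn(fs) * np.exp(-3.0 * t)))
print(t60_from_air(h, fs), drr_from_air(h, fs))            # roughly 2.3 s and a negative DRR
```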

  • Patent
    Eaton DJ, Moore AH, Naylor PA, Skoglund J et al., 2016,

    Reverberation estimator

    , US20160118038 A1

    Provided are methods and systems for generating Direct-to-Reverberant Ratio (DRR) estimates. The methods and systems use a null-steered beamformer to produce accurate DRR estimates across a variety of room sizes, reverberation times, and source-receiver distances. The DRR estimation algorithm uses spatial selectivity to separate direct and reverberant energy and account for noise separately. The formulation considers the response of the beamformer to reverberant sound and the effect of noise. The DRR estimation algorithm is more robust to background noise than existing approaches, and is applicable where a signal is recorded with two or more microphones, such as with mobile communications devices, laptop computers, and the like.

  • Journal article
    Sharma D, Naylor PA, Wang Y, Brookes DM et al., 2016,

    A Data-Driven Non-intrusive Measure of Speech Quality and Intelligibility

    , Speech Communication, Vol: 80, Pages: 84-94, ISSN: 0167-6393

    Speech signals are often affected by additive noise and distortion which can degrade the perceived quality and intelligibility of the signal. We present a new measure, NISA, for estimating the quality and intelligibility of speech degraded by additive noise and distortions associated with telecommunications networks, based on a data-driven framework of feature extraction and tree-based regression. The new measure is non-intrusive, operating on the degraded signal alone without the need for a reference signal. This makes the measure applicable to practical speech processing applications operating in the single-ended mode. The new measure has been evaluated against the intrusive measures PESQ and STOI. The results indicate that the accuracy of the new non-intrusive method is around 90% of the accuracy of the intrusive measures, depending on the test scenario. The NISA measure therefore provides non-intrusive (single-ended) PESQ and STOI estimates with high accuracy.
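
    The pipeline below is only a toy illustration of the data-driven recipe described above: hand-crafted features are extracted from the degraded signal alone and a tree ensemble regresses them onto an intrusive target score such as PESQ. The feature set, targets and model here are placeholders, not the actual NISA features or training data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def toy_features(x, frame=512):
    """A few crude per-utterance features computed from the degraded signal only."""
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    log_energy = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.array([log_energy.mean(), log_energy.std(), zcr.mean(), zcr.std()])

# Placeholder training set: features from synthetic signals, random PESQ-like targets
rng = np.random.default_rng(0)
X = np.stack([toy_features(rng.standard_normal(16000)) for _ in range(50)])
y = rng.uniform(1.0, 4.5, size=50)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
estimate = model.predict(toy_features(rng.standard_normal(16000)).reshape(1, -1))
```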

  • Conference paper
    Javed HA, Moore AH, Naylor PA, 2016,

    Spherical microphone array acoustic rake receivers

    , ICASSP, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 111-115, ISSN: 0736-7791

    Several signal independent acoustic rake receivers are proposed for speech dereverberation using spherical microphone arrays. The proposed rake designs take advantage of multipaths, by separately capturing and combining early reflections with the direct path. We investigate several approaches in combining reflections with the direct path source signal, including the development of beam patterns that point nulls at all preceding reflections. The proposed designs are tested in experimental simulations and their dereverberation performances evaluated using objective measures. For the tested configuration, the proposed designs achieve higher levels of dereverberation compared to conventional signal independent beamforming systems; achieving up to 3.6 dB improvement in the direct-to-reverberant ratio over the plane-wave decomposition beamformer.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
