Search or filter publications

Filter by type:

Filter by publication type

Filter by year:

to

Results

  • Showing results for:
  • Reset all filters

Search results

  • Journal article
    Evers C, Naylor PA, 2017,

    Optimized Self-Localization for SLAM in Dynamic Scenes using Probability Hypothesis Density Filters

    , IEEE Transactions on Signal Processing, Vol: 66, Pages: 863-878, ISSN: 1053-587X

    In many applications, sensors that map the positions of objects in unknown environments are installed on dynamic platforms. As measurements are relative to the observer's sensors, scene mapping requires accurate knowledge of the observer state. However, in practice, observer reports are subject to positioning errors. Simultaneous Localization and Mapping (SLAM) addresses the joint estimation problem of observer localization and scene mapping. State-of-the-art approaches typically use visual or optical sensors and therefore rely on static beacons in the environment to anchor the observer estimate. However, many applications involving sensors that are not conventionally used for SLAM are affected by highly dynamic scenes, such that the static world assumption is invalid. This paper proposes a novel approach for dynamic scenes, called GEneralized Motion (GEM)-SLAM. Based on Probability Hypothesis Density (PHD) filters, the proposed approach probabilistically anchors the observer state by fusing observer information inferred from the scene with reports of the observer motion. This paper derives the general, theoretical framework for GEM-SLAM and shows that it generalizes existing PHD-based SLAM algorithms. Simulations for a model-specific realization using range-bearing sensors and multiple moving objects highlight that GEM-SLAM achieves significant improvements over three benchmark algorithms.

  • Conference paper
    Papayiannis C, Evers C, Naylor PA, 2017,

    Sparse parametric modeling of the early part of acoustic impulse responses

    , 25th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 678-682, ISSN: 2076-1465

    Acoustic channels are typically described by their Acoustic Impulse Response (AIR) as a Moving Average (MA) process. Such AIRs are often considered in terms of their early and late parts, describing discrete reflections and the diffuse reverberation tail respectively. We propose an approach for constructing a sparse parametric model for the early part. The model aims to reduce the number of parameters needed to represent the early part, while allowing the MA coefficients that describe it to be reconstructed from that representation. It represents the reflections arriving at the receiver as delayed copies of an excitation signal. The Times-Of-Arrival of reflections are not restricted to integer sample instants, and a dynamically estimated model for the excitation sound is used. We also present a corresponding parameter estimation method, which is based on regularized regression and nonlinear optimization. The proposed method also serves as an analysis tool, since estimated parameters can be used for the estimation of room geometry, the mixing time and other channel properties. Experiments involving simulated and measured AIRs are presented, in which the AIR coefficient reconstruction-error energy does not exceed 11.4% of the energy of the original AIR coefficients. The results also indicate dimensionality reduction figures exceeding 90% when compared to an MA process representation.
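The core reconstruction idea in the abstract above, early reflections as delayed, scaled copies of an excitation signal with fractional-sample times-of-arrival, can be sketched as follows. This is a minimal illustration only, not the authors' implementation; the dynamically estimated excitation model and the regularized-regression fitting are omitted, and all values below are invented.

```python
import numpy as np

def reconstruct_early_air(toas, gains, excitation, n_taps):
    """Rebuild the early part of an AIR from reflection parameters.

    Each reflection is a delayed, scaled copy of the excitation signal;
    fractional-sample TOAs are realised with a bandlimited (sinc) delay.
    toas: reflection times-of-arrival in samples (floats allowed).
    gains: amplitude of each reflection.
    excitation: short excitation waveform (here just a unit pulse).
    n_taps: length of the reconstructed MA coefficient vector.
    """
    air = np.zeros(n_taps)
    n = np.arange(n_taps)
    for toa, g in zip(toas, gains):
        # bandlimited impulse centred at the (fractional) TOA
        delayed = np.sinc(n - toa)
        air += g * np.convolve(delayed, excitation)[:n_taps]
    return air

# toy example: direct path at 10.0 samples, one reflection at 23.4 samples
h = reconstruct_early_air([10.0, 23.4], [1.0, 0.6], np.array([1.0]), 64)
```

The sparse representation here is just the (TOA, gain) pairs plus the excitation, far fewer numbers than the full MA coefficient vector they reconstruct.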

  • Journal article
    Forte AE, Etard O, Reichenbach J, 2017,

    The human auditory brainstem response to running speech reveals a subcortical mechanism for selective attention

    , eLife, Vol: 6, ISSN: 2050-084X

    Humans excel at selectively listening to a target speaker in background noise such as competing voices. While the encoding of speech in the auditory cortex is modulated by selective attention, it remains debated whether such modulation occurs already in subcortical auditory structures. Investigating the contribution of the human brainstem to attention has, in particular, been hindered by the tiny amplitude of the brainstem response. Its measurement normally requires a large number of repetitions of the same short sound stimuli, which may lead to a loss of attention and to neural adaptation. Here we develop a mathematical method to measure the auditory brainstem response to running speech, an acoustic stimulus that does not repeat and that has a high ecological validity. We employ this method to assess the brainstem's activity when a subject listens to one of two competing speakers, and show that the brainstem response is consistently modulated by attention.
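The measurement idea above, relating a continuous EEG recording to a running-speech stimulus rather than averaging over repeated short stimuli, can be illustrated with a toy sketch. The paper's actual method is a more elaborate complex-valued regression; here a simple normalised cross-correlation between the EEG and the (assumed already extracted) fundamental waveform of the speech stands in for it, and the sampling rate and lag range are illustrative.

```python
import numpy as np

def brainstem_response(eeg, fundamental, fs, max_lag_ms=20):
    """Relate EEG to the fundamental waveform of running speech.

    Returns the normalised cross-correlation at each latency up to
    max_lag_ms, plus the latency (ms) of the strongest correlation.
    """
    max_lag = int(fs * max_lag_ms / 1000)
    eeg = (eeg - eeg.mean()) / eeg.std()
    fundamental = (fundamental - fundamental.mean()) / fundamental.std()
    lags = np.arange(max_lag + 1)
    corr = np.array([
        np.mean(eeg[lag:] * fundamental[:len(fundamental) - lag])
        for lag in lags
    ])
    best = lags[np.argmax(np.abs(corr))]
    return corr, best / fs * 1000.0  # correlations, latency in ms
```

Because running speech never repeats, every sample contributes to the estimate, which is what lets the response be measured without the attention loss and adaptation that stimulus repetition causes.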

  • Journal article
    Goodman DFM, Winter IM, Léger AC, de Cheveigné A, Lorenzi C et al., 2017,

    Modelling firing regularity in the ventral cochlear nucleus: Mechanisms, and effects of stimulus level and synaptopathy

    , Hearing Research, Vol: 358, Pages: 98-110, ISSN: 0378-5955

    The auditory system processes temporal information at multiple scales, and disruptions to this temporal processing may lead to deficits in auditory tasks such as detecting and discriminating sounds in a noisy environment. Here, a modelling approach is used to study the temporal regularity of firing by chopper cells in the ventral cochlear nucleus, in both the normal and impaired auditory system. Chopper cells, which have a strikingly regular firing response, divide into two classes, sustained and transient, based on the time course of this regularity. Several hypotheses have been proposed to explain the behaviour of chopper cells, and the difference between sustained and transient cells in particular. However, there is no conclusive evidence so far. Here, a reduced mathematical model is developed and used to compare and test a wide range of hypotheses with a limited number of parameters. Simulation results show a continuum of cell types and behaviours: chopper-like behaviour arises for a wide range of parameters, suggesting that multiple mechanisms may underlie this behaviour. The model accounts for systematic trends in regularity as a function of stimulus level that have previously only been reported anecdotally. Finally, the model is used to predict the effects of a reduction in the number of auditory nerve fibres (deafferentation due to, for example, cochlear synaptopathy). An interactive version of this paper in which all the model parameters can be changed is available online.
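To give a flavour of the kind of reduced model the abstract describes, here is a minimal leaky integrate-and-fire neuron driven by summed Poisson auditory-nerve input, with firing regularity summarised by the coefficient of variation (CV) of the inter-spike intervals (low CV means regular, chopper-like firing). All parameter values are invented for illustration; they are not the paper's.

```python
import numpy as np

def chopper_cv(n_inputs=10, rate=200.0, tau=5e-3, threshold=1.0,
               w=0.15, t_max=1.0, dt=1e-4, seed=0):
    """Reduced leaky integrate-and-fire sketch of a chopper cell.

    n_inputs auditory-nerve fibres, each firing at `rate` Hz, drive the
    cell; the summed input per time step is Poisson. Returns the CV of
    the inter-spike intervals over a t_max-second simulation.
    """
    rng = np.random.default_rng(seed)
    steps = int(t_max / dt)
    v, spikes = 0.0, []
    for i in range(steps):
        n_in = rng.poisson(n_inputs * rate * dt)  # summed AN input spikes
        v += -v / tau * dt + w * n_in             # leak + synaptic drive
        if v >= threshold:
            spikes.append(i * dt)
            v = 0.0                               # reset after spiking
    isi = np.diff(spikes)
    return isi.std() / isi.mean()                 # coefficient of variation
```

Reducing `n_inputs` in this sketch mimics deafferentation: fewer, larger inputs per output spike make the interval statistics noisier and push the CV up, which is the qualitative effect the paper's synaptopathy prediction concerns.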

  • Conference paper
    Forte AE, Etard O, Reichenbach J, 2017,

    Selective auditory attention modulates the human brainstem's response to running speech

    , Basic Auditory Science 2017
  • Conference paper
    Kegler M, Etard O, Forte AE, Reichenbach J et al., 2017,

    Complex statistical model for detecting the auditory brainstem response to natural speech and for decoding attention

    , Basic Auditory Science 2017
  • Conference paper
    Isaac Engel J, Picinali L, 2017,

    Long-term user adaptation to an audio augmented reality system

    , 24th International Congress on Sound and Vibration, 2017

    Audio Augmented Reality (AAR) consists of extending a real auditory environment with virtual sound sources. This can be achieved using binaural earphones/microphones. The microphones, placed in the outer part of each earphone, record sounds from the user's environment, which are then mixed with virtual binaural audio, and the resulting signal is played back through the earphones. However, previous studies show that, with a system of this type, audio coming from the microphones (or hear-through audio) does not sound natural to the user. The goal of this study is to explore the capabilities of long-term user adaptation to an AAR system built with off-the-shelf components (a pair of binaural microphones/earphones and a smartphone), aiming to achieve perceived realism for the hear-through audio. To compensate for the acoustic effects of ear canal occlusion, the recorded signal is equalised on the smartphone. In-out latency was minimised to avoid distortion caused by the comb filtering effect. To evaluate the adaptation of users to the headset, two case studies were performed. The subjects wore an AAR headset for several days while performing daily tests to track the progress of the adaptation. Both quantitative and qualitative evaluations (i.e., localising real and virtual sound sources and analysing the perception of pre-recorded auditory scenes) were carried out, finding slight signs of adaptation, especially in the subjective tests. A demo will be available for conference visitors, also including the integration of visual Augmented Reality functionalities.
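The comb filtering mentioned in this abstract arises because the hear-through signal, delayed by the system's in-out latency, mixes with direct sound leaking past the earphone. A short sketch of the resulting magnitude response (the unit-gain leakage and hear-through levels are illustrative assumptions):

```python
import numpy as np

def comb_magnitude(latency_ms, freqs, leak=1.0, ht=1.0):
    """Magnitude response of direct leakage plus a hear-through copy
    delayed by the in-out latency: |leak + ht * exp(-j*2*pi*f*d)|.
    Notches repeat every 1/d Hz, so lower latency pushes them higher
    in frequency and makes the colouration less audible."""
    d = latency_ms / 1000.0
    return np.abs(leak + ht * np.exp(-2j * np.pi * freqs * d))

freqs = np.linspace(0, 2000, 2001)   # 1 Hz resolution up to 2 kHz
mag = comb_magnitude(5.0, freqs)     # 5 ms latency: notches every 200 Hz
```

With equal levels, the response swings between 2 (in phase) and 0 (a notch), which is why the study treats latency minimisation as essential for natural-sounding hear-through audio.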

  • Conference paper
    Parada PP, Sharma D, van Waterschoot T, Naylor PA et al., 2017,

    Robust Statistical Processing of TDOA Estimates for Distant Speaker Diarization

    , 25th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 86-90, ISSN: 2076-1465
  • Conference paper
    Etard O, Reichenbach J, 2017,

    EEG-measured correlates of comprehension in speech-in-noise listening

    , Basic Auditory Science 2017
  • Journal article
    Sidiras C, Iliadou V, Nimatoudis I, Reichenbach T, Bamiou D-E et al., 2017,

    Spoken word recognition enhancement due to preceding synchronized beats compared to unsynchronized or unrhythmic beats

    , Frontiers in Neuroscience, Vol: 11, ISSN: 1662-4548

    The relation between rhythm and language has been investigated over the last decades, with evidence that these share overlapping perceptual mechanisms emerging from several different strands of research. The dynamic Attention Theory posits that neural entrainment to musical rhythm results in synchronized oscillations in attention, enhancing perception of other events occurring at the same rate. In this study, this prediction was tested in 10-year-old children by means of a psychoacoustic speech-recognition-in-babble paradigm. It was hypothesized that rhythm effects evoked via a short isochronous sequence of beats would provide optimal word recognition in babble when beats and word are in sync. We compared speech-recognition-in-babble performance in the presence of an isochronous and in-sync vs. a non-isochronous or out-of-sync sequence of beats. Results showed that (a) word recognition was best when rhythm and word were in sync, and (b) the effect was not uniform across syllables and gender of subjects. Our results suggest that pure tone beats affect speech recognition at early levels of sensory or phonemic processing.

  • Conference paper
    Evers C, Dorfan Y, Gannot S, Naylor PA et al., 2017,

    Source tracking using moving microphone arrays for robot audition

    , IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE

    Intuitive spoken dialogues are a prerequisite for human-robot interaction. In many practical situations, robots must be able to identify and focus on sources of interest in the presence of interfering speakers. Techniques such as spatial filtering and blind source separation are therefore often used, but rely on accurate knowledge of the source location. In practice, sound emitted in enclosed environments is subject to reverberation and noise. Hence, sound source localization must be robust to both diffuse noise due to late reverberation, as well as spurious detections due to early reflections. For improved robustness against reverberation, this paper proposes a novel approach for sound source tracking that constructively exploits the spatial diversity of a microphone array installed in a moving robot. In previous work, we developed speaker localization approaches using expectation-maximization (EM) approaches and using Bayesian approaches. In this paper we propose to combine the EM and Bayesian approach in one framework for improved robustness against reverberation and noise.

  • Conference paper
    Lightburn L, De Sena E, Moore AH, Naylor PA, Brookes D et al., 2017,

    Improving the perceptual quality of ideal binary masked speech

    , 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: Institute of Electrical and Electronics Engineers (IEEE), Pages: 661-665, ISSN: 1520-6149

    It is known that applying a time-frequency binary mask to very noisy speech can improve its intelligibility but results in poor perceptual quality. In this paper we propose a new approach to applying a binary mask that combines the intelligibility gains of conventional binary masking with the perceptual quality gains of a classical speech enhancer. The binary mask is not applied directly as a time-frequency gain as in most previous studies. Instead, the mask is used to supply prior information to a classical speech enhancer about the probability of speech presence in different time-frequency regions. Using an oracle ideal binary mask, we show that the proposed method results in a higher predicted quality than other methods of applying a binary mask whilst preserving the improvements in predicted intelligibility.
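The mask-as-prior idea in this abstract can be sketched schematically: instead of zeroing masked-out cells, the binary mask sets a per-cell prior speech-presence probability that softens a classical gain. This is a sketch of the general approach with invented constants (`p_hi`, `p_lo`, the gain floor), not the paper's actual estimator, which is built on a full statistical speech enhancer.

```python
import numpy as np

def enhance_with_mask_prior(noisy_stft, mask, snr_est, p_hi=0.9, p_lo=0.1):
    """Use a binary mask as a prior on speech presence rather than as a
    hard time-frequency gain.

    mask == 1 cells get a high prior speech-presence probability p_hi,
    mask == 0 cells a low one p_lo; the final gain is a Wiener gain
    softened by the prior, with a small floor where speech is deemed
    absent, avoiding the musical-noise artefacts of hard masking.
    """
    prior = np.where(mask > 0, p_hi, p_lo)
    wiener = snr_est / (1.0 + snr_est)          # classical Wiener gain
    gain = prior * wiener + (1 - prior) * 0.05  # floored soft gain
    return gain * noisy_stft
```

The key contrast with direct masking is that no time-frequency cell is ever set exactly to zero, so the intelligibility benefit of the mask is kept while the processing artefacts that hurt perceptual quality are reduced.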

  • Journal article
    Ciganovic N, Wolde-Kidan A, Reichenbach JDT, 2017,

    Hair bundles of cochlear outer hair cells are shaped to minimize their fluid-dynamic resistance

    , Scientific Reports, Vol: 7, ISSN: 2045-2322

    The mammalian sense of hearing relies on two types of sensory cells: inner hair cells transmit the auditory stimulus to the brain, while outer hair cells mechanically modulate the stimulus through active feedback. Stimulation of a hair cell is mediated by displacements of its mechanosensitive hair bundle which protrudes from the apical surface of the cell into a narrow fluid-filled space between reticular lamina and tectorial membrane. While hair bundles of inner hair cells are of linear shape, those of outer hair cells exhibit a distinctive V-shape. The biophysical rationale behind this morphology, however, remains unknown. Here we use analytical and computational methods to study the fluid flow across rows of differently shaped hair bundles. We find that rows of V-shaped hair bundles have a considerably reduced resistance to crossflow, and that the biologically observed shapes of hair bundles of outer hair cells are near-optimal in this regard. This observation accords with the function of outer hair cells and lends support to the recent hypothesis that inner hair cells are stimulated by a net flow, in addition to the well-established shear flow that arises from shearing between the reticular lamina and the tectorial membrane.

  • Conference paper
    Picinali L, Wallin A, Levtov Y, Poirier-Quinot D et al., 2017,

    Comparative perceptual evaluation between different methods for implementing Reverberation in a binaural context

    , AES 2017, Publisher: Audio Engineering Society

    Reverberation has always been considered of primary importance in order to improve the realism, externalisation and immersiveness of binaurally spatialised sounds. Different techniques exist for implementing reverberation in a binaural context, each with a different level of computational complexity and spatial accuracy. A perceptual study has been performed to compare the realism and localisation accuracy achieved using five different binaural reverberation techniques. These included multichannel Ambisonic-based, stereo and mono reverberation methods. A custom web-based application has been developed implementing the testing procedures and allowing participants to take the test remotely. Initial results with 54 participants show that no major difference in terms of perceived level of realism and spatialisation accuracy could be found between four of the five proposed reverberation methods, suggesting that a high level of complexity in the reverberation process does not always correspond to improved perceptual attributes.

  • Journal article
    Doire CSJ, Brookes DM, Naylor PA, 2017,

    Robust and efficient Bayesian adaptive psychometric function estimation

    , Journal of the Acoustical Society of America, Vol: 141, Pages: 2501-2512, ISSN: 0001-4966

    The efficient measurement of the threshold and slope of the psychometric function (PF) is an important objective in psychoacoustics. This paper proposes a procedure that combines a Bayesian estimate of the PF with either a look one-ahead or a look two-ahead method of selecting the next stimulus presentation. The procedure differs from previously proposed algorithms in two respects: (i) it does not require the range of possible PF parameters to be specified in advance and (ii) the sequence of probe signal-to-noise ratios optimizes the threshold and slope estimates at a performance level, ϕ, that can be chosen by the experimenter. Simulation results show that the proposed procedure is robust and that the estimates of both threshold and slope have a consistently low bias. Over a wide range of listener PF parameters, the root-mean-square errors after 50 trials were ∼1.2 dB in threshold and 0.14 in log-slope. It was found that the performance differences between the look one-ahead and look two-ahead methods were negligible and that an entropy-based criterion for selecting the next stimulus was preferred to a variance-based criterion.
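A generic grid-based version of the look-one-ahead loop with an entropy criterion reads roughly as follows. Note that this sketch does pre-specify a parameter grid, which is precisely what the proposed procedure avoids, and the function shapes, grids and the guess/lapse rates are illustrative assumptions, not the paper's.

```python
import numpy as np

def psychometric(snr, thresh, slope, guess=0.5, lapse=0.02):
    """Logistic psychometric function: P(correct | snr)."""
    p = 1.0 / (1.0 + np.exp(-slope * (snr - thresh)))
    return guess + (1 - guess - lapse) * p

def update_posterior(posterior, thresh_grid, slope_grid, snr, correct):
    """Bayesian update of a grid posterior over (threshold, slope)."""
    T, S = np.meshgrid(thresh_grid, slope_grid, indexing="ij")
    p = psychometric(snr, T, S)
    like = p if correct else 1 - p
    posterior = posterior * like
    return posterior / posterior.sum()

def next_probe_entropy(posterior, thresh_grid, slope_grid, candidates):
    """Look one ahead: pick the probe SNR whose expected posterior
    entropy (averaged over the two possible responses) is lowest."""
    T, S = np.meshgrid(thresh_grid, slope_grid, indexing="ij")
    best, best_h = None, np.inf
    for snr in candidates:
        p = psychometric(snr, T, S)
        pc = (posterior * p).sum()  # predictive P(correct)
        h = 0.0
        for outcome, p_outcome in ((True, pc), (False, 1 - pc)):
            post = update_posterior(posterior, thresh_grid,
                                    slope_grid, snr, outcome)
            h += p_outcome * -(post * np.log(post + 1e-12)).sum()
        if h < best_h:
            best, best_h = snr, h
    return best
```

After each trial the experimenter updates the posterior with the observed response and asks `next_probe_entropy` for the following stimulus; a look-two-ahead variant would recurse one level deeper over the two outcomes.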

  • Conference paper
    Pinero G, Naylor PA, 2017,

    CHANNEL ESTIMATION FOR CROSSTALK CANCELLATION IN WIRELESS ACOUSTIC NETWORKS

    , IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 586-590, ISSN: 1520-6149
  • Conference paper
    Javed HA, Cauchi B, Doclo S, Naylor PA, Goetze S et al., 2017,

    MEASURING, MODELLING AND PREDICTING PERCEIVED REVERBERATION

    , IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 381-385, ISSN: 1520-6149
  • Conference paper
    Forte AE, Etard O, Reichenbach J, 2017,

    Complex Auditory-brainstem Response to the Fundamental Frequency of Continuous Natural Speech

    , ARO 2017
  • Conference paper
    Evers C, Moore A, Naylor P, 2016,

    Localization of Moving Microphone Arrays from Moving Sound Sources for Robot Audition

    , European Signal Processing Conference (EUSIPCO), Publisher: IEEE, ISSN: 2076-1465

    Acoustic Simultaneous Localization and Mapping (a-SLAM) jointly localizes the trajectory of a microphone array installed on a moving platform, whilst estimating the acoustic map of surrounding sound sources, such as human speakers. Whilst traditional approaches for SLAM in the vision and optical research literature rely on the assumption that the surrounding map features are static, in the acoustic case the positions of talkers are usually time-varying due to head rotations and body movements. This paper demonstrates that tracking of moving sources can be incorporated in a-SLAM by modelling the acoustic map as a Random Finite Set (RFS) of multiple sources and explicitly imposing models of the source dynamics. The proposed approach is verified and its performance evaluated for realistic simulated data.

  • Journal article
    Moore AH, Evers C, Naylor PA, 2016,

    Direction of Arrival Estimation in the Spherical Harmonic Domain using Subspace Pseudo-Intensity Vectors

    , IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol: 25, Pages: 178-192, ISSN: 2329-9290

    Direction of Arrival (DOA) estimation is a fundamental problem in acoustic signal processing. It is used in a diverse range of applications, including spatial filtering, speech dereverberation, source separation and diarization. Intensity vector-based DOA estimation is attractive, especially for spherical sensor arrays, because it is computationally efficient. Two such methods are presented which operate on a spherical harmonic decomposition of a sound field observed using a spherical microphone array. The first uses Pseudo-Intensity Vectors (PIVs) and works well in acoustic environments where only one sound source is active at any time. The second uses Subspace Pseudo-Intensity Vectors (SSPIVs) and is targeted at environments where multiple simultaneous sources and significant levels of reverberation make the problem more challenging. Analytical models are used to quantify the effects of an interfering source, diffuse noise and sensor noise on PIVs and SSPIVs. The accuracy of DOA estimation using PIVs and SSPIVs is compared against the state-of-the-art in simulations including realistic reverberation and noise for single and multiple, stationary and moving sources. Finally, robust performance of the proposed methods is demonstrated using speech recordings in real acoustic environments.
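The PIV computation at the heart of the first method is compact. A sketch, assuming the zeroth-order (omnidirectional) and three first-order (dipole) eigenbeams have already been extracted from the spherical harmonic decomposition; sign conventions for whether the resulting vector points toward or away from the source vary across formulations, and the subspace (SSPIV) refinement is not shown.

```python
import numpy as np

def pseudo_intensity_doa(p0, px, py, pz):
    """DOA estimate from a pseudo-intensity vector.

    p0 is the zeroth-order eigenbeam and px, py, pz the three
    first-order eigenbeams, all as complex STFT coefficients.
    The PIV is the time-frequency average of Re{conj(p0)*[px,py,pz]};
    normalising it yields a unit vector along the source direction.
    """
    piv = np.array([
        np.mean(np.real(np.conj(p0) * px)),
        np.mean(np.real(np.conj(p0) * py)),
        np.mean(np.real(np.conj(p0) * pz)),
    ])
    return piv / np.linalg.norm(piv)
```

For a single plane wave the dipole beams are the omnidirectional beam scaled by the direction cosines, so the averaged product recovers the propagation direction, which is why the PIV is so cheap compared with subspace search methods.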

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
