Journal article: Dietz M, Lestang J-H, Majdak P, et al., 2017,
Auditory research has a rich history of combining experimental evidence with computational simulations of auditory processing in order to deepen our theoretical understanding of how sound is processed in the ears and in the brain. Despite significant progress in the amount of detail and breadth covered by auditory models, for many components of the auditory pathway there are still different model approaches that are often not equivalent but rather in conflict with each other. Similarly, some experimental studies yield conflicting results, which have led to controversies. These conflicts can best be resolved by a systematic comparison of multiple experimental data sets and model approaches. Binaural processing is a prominent example of how the development of quantitative theories can advance our understanding of the phenomena, but there remain several unresolved questions for which competing model approaches exist. This article discusses a number of current unresolved or disputed issues in binaural modelling, as well as some of the significant challenges in comparing binaural models with each other and with the experimental data. We introduce an auditory model framework, which we believe can become a useful infrastructure for resolving some of the current controversies. It operates models over the same paradigms that are used experimentally. The core of the proposed framework is an interface that connects three components irrespective of their underlying programming language: the experiment software, an auditory pathway model, and task-dependent decision stages called artificial observers that provide the same output format as the test subject.
Conference paper: Papayiannis C, Evers C, Naylor PA, 2017,
Acoustic channels are typically described by their Acoustic Impulse Response (AIR) as a Moving Average (MA) process. Such AIRs are often considered in terms of their early and late parts, describing discrete reflections and the diffuse reverberation tail respectively. We propose an approach for constructing a sparse parametric model for the early part. The model aims at reducing the number of parameters needed to represent it and subsequently reconstruct from the representation the MA coefficients that describe it. It consists of a representation of the reflections arriving at the receiver as delayed copies of an excitation signal. The Time-Of-Arrivals of reflections are not restricted to integer sample instances and a dynamically estimated model for the excitation sound is used. We also present a corresponding parameter estimation method, which is based on regularized-regression and nonlinear optimization. The proposed method also serves as an analysis tool, since estimated parameters can be used for the estimation of room geometry, the mixing time and other channel properties. Experiments involving simulated and measured AIRs are presented, in which the AIR coefficient reconstruction-error energy does not exceed 11.4% of the energy of the original AIR coefficients. The results also indicate dimensionality reduction figures exceeding 90% when compared to a MA process representation.
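The idea of representing the early part of an AIR as delayed copies of an excitation signal can be illustrated with a minimal sketch. This is not the authors' estimation method: the Hann-windowed sinc excitation, the function names, and the reflection values below are all assumptions chosen for illustration, whereas the paper estimates the excitation model dynamically. The sketch only shows how a few (amplitude, time-of-arrival) pairs with non-integer delays could be turned back into MA coefficients.

```python
import math


def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x), with the limit value at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def excitation(t, half_width=8.0):
    """Hypothetical excitation pulse (assumption): a Hann-windowed sinc.

    t is measured in samples and may be fractional.
    """
    if abs(t) >= half_width:
        return 0.0
    window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    return sinc(t) * window


def reconstruct_early_air(reflections, length):
    """Sum delayed, scaled copies of the excitation into MA coefficients.

    reflections: list of (amplitude, time_of_arrival_in_samples) pairs;
    the times of arrival are not restricted to integers.
    """
    h = [0.0] * length
    for amplitude, toa in reflections:
        for n in range(length):
            h[n] += amplitude * excitation(n - toa)
    return h


# Illustrative values only: a direct path and two reflections.
reflections = [(1.0, 10.0), (0.6, 23.4), (-0.3, 41.75)]
h = reconstruct_early_air(reflections, 64)
print(len(reflections) * 2, "parameters vs", len(h), "coefficients")
```

In this toy case six parameters stand in for 64 coefficients, which is the kind of dimensionality reduction the abstract refers to; the figures reported in the paper come from its own experiments, not from this sketch.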
Journal article: Forte AE, Etard O, Reichenbach J, 2017,
The human auditory brainstem response to running speech reveals a subcortical mechanism for selective attention, eLife, Vol: 6, ISSN: 2050-084X
Humans excel at selectively listening to a target speaker in background noise such as competing voices. While the encoding of speech in the auditory cortex is modulated by selective attention, it remains debated whether such modulation occurs already in subcortical auditory structures. Investigating the contribution of the human brainstem to attention has, in particular, been hindered by the tiny amplitude of the brainstem response. Its measurement normally requires a large number of repetitions of the same short sound stimuli, which may lead to a loss of attention and to neural adaptation. Here we develop a mathematical method to measure the auditory brainstem response to running speech, an acoustic stimulus that does not repeat and that has a high ecological validity. We employ this method to assess the brainstem's activity when a subject listens to one of two competing speakers, and show that the brainstem response is consistently modulated by attention.
Journal article: Goodman DFM, Winter IM, Léger AC, et al., 2017,
Modelling firing regularity in the ventral cochlear nucleus: Mechanisms, and effects of stimulus level and synaptopathy, Hearing Research, Vol: 358, Pages: 98-110, ISSN: 0378-5955
The auditory system processes temporal information at multiple scales, and disruptions to this temporal processing may lead to deficits in auditory tasks such as detecting and discriminating sounds in a noisy environment. Here, a modelling approach is used to study the temporal regularity of firing by chopper cells in the ventral cochlear nucleus, in both the normal and impaired auditory system. Chopper cells, which have a strikingly regular firing response, divide into two classes, sustained and transient, based on the time course of this regularity. Several hypotheses have been proposed to explain the behaviour of chopper cells, and the difference between sustained and transient cells in particular. However, there is no conclusive evidence so far. Here, a reduced mathematical model is developed and used to compare and test a wide range of hypotheses with a limited number of parameters. Simulation results show a continuum of cell types and behaviours: chopper-like behaviour arises for a wide range of parameters, suggesting that multiple mechanisms may underlie this behaviour. The model accounts for systematic trends in regularity as a function of stimulus level that have previously only been reported anecdotally. Finally, the model is used to predict the effects of a reduction in the number of auditory nerve fibres (deafferentation due to, for example, cochlear synaptopathy). An interactive version of this paper in which all the model parameters can be changed is available online.
Conference paper: Forte AE, Etard O, Reichenbach J, 2017,
Selective auditory attention modulates the human brainstem's response to running speech, Basic Auditory Science 2017
Conference paper: Kegler M, Etard O, Forte AE, et al., 2017,
Complex statistical model for detecting the auditory brainstem response to natural speech and for decoding attention, Basic Auditory Science 2017
Conference paper: Isaac Engel J, Picinali L, 2017,
Long-term user adaptation to an audio augmented reality system, 24th International Congress on Sound and Vibration, 2017
Audio Augmented Reality (AAR) consists of extending a real auditory environment with virtual sound sources. This can be achieved using binaural earphones/microphones. The microphones, placed in the outer part of each earphone, record sounds from the user's environment, which are then mixed with virtual binaural audio, and the resulting signal is finally played back through the earphones. However, previous studies show that, with a system of this type, audio coming from the microphones (or hear-through audio) does not sound natural to the user. The goal of this study is to explore the capabilities of long-term user adaptation to an AAR system built with off-the-shelf components (a pair of binaural microphones/earphones and a smartphone), aiming to achieve perceived realism for the hear-through audio. To compensate for the acoustical effects of ear canal occlusion, the recorded signal is equalised in the smartphone. In-out latency was minimised to avoid distortion caused by the comb filtering effect. To evaluate the adaptation process of the users to the headset, two case studies were performed. The subjects wore an AAR headset for several days while performing daily tests to check the progress of the adaptation. Both quantitative and qualitative evaluations (i.e., localising real and virtual sound sources and analysing the perception of pre-recorded auditory scenes) were carried out, finding slight signs of adaptation, especially in the subjective tests. A demo will be available for the conference visitors, also including the integration of visual Augmented Reality functionalities.
Conference paper: Parada PP, Sharma D, van Waterschoot T, et al., 2017,
Robust Statistical Processing of TDOA Estimates for Distant Speaker Diarization, 25th European Signal Processing Conference (EUSIPCO), Publisher: IEEE, Pages: 86-90, ISSN: 2076-1465
Conference paper: Etard O, Reichenbach J, 2017,
EEG-measured correlates of comprehension in speech-in-noise listening, Basic Auditory Science 2017
Journal article: Sidiras C, Iliadou V, Nimatoudis I, et al., 2017,
Spoken word recognition enhancement due to preceding synchronized beats compared to unsynchronized or unrhythmic beats, Frontiers in Neuroscience, Vol: 11, ISSN: 1662-4548
The relation between rhythm and language has been investigated over the last decades, with evidence that these share overlapping perceptual mechanisms emerging from several different strands of research. The dynamic Attention Theory posits that neural entrainment to musical rhythm results in synchronized oscillations in attention, enhancing perception of other events occurring at the same rate. In this study, this prediction was tested in 10-year-old children by means of a psychoacoustic speech-recognition-in-babble paradigm. It was hypothesized that rhythm effects evoked via a short isochronous sequence of beats would provide optimal word recognition in babble when beats and word are in sync. We compared speech-recognition-in-babble performance in the presence of isochronous and in-sync sequences of beats vs. non-isochronous or out-of-sync sequences of beats. Results showed that (a) word recognition was best when rhythm and word were in sync, and (b) the effect was not uniform across syllables and gender of subjects. Our results suggest that pure tone beats affect speech recognition at early levels of sensory or phonemic processing.
Conference paper: Evers C, Dorfan Y, Gannot S, et al., 2017,
Intuitive spoken dialogues are a prerequisite for human-robot interaction. In many practical situations, robots must be able to identify and focus on sources of interest in the presence of interfering speakers. Techniques such as spatial filtering and blind source separation are therefore often used, but rely on accurate knowledge of the source location. In practice, sound emitted in enclosed environments is subject to reverberation and noise. Hence, sound source localization must be robust to both diffuse noise due to late reverberation, as well as spurious detections due to early reflections. For improved robustness against reverberation, this paper proposes a novel approach for sound source tracking that constructively exploits the spatial diversity of a microphone array installed in a moving robot. In previous work, we developed speaker localization approaches using expectation-maximization (EM) approaches and using Bayesian approaches. In this paper we propose to combine the EM and Bayesian approach in one framework for improved robustness against reverberation and noise.
Conference paper: Lightburn L, De Sena E, Moore AH, et al., 2017,
It is known that applying a time-frequency binary mask to very noisy speech can improve its intelligibility but results in poor perceptual quality. In this paper we propose a new approach to applying a binary mask that combines the intelligibility gains of conventional binary masking with the perceptual quality gains of a classical speech enhancer. The binary mask is not applied directly as a time-frequency gain as in most previous studies. Instead, the mask is used to supply prior information to a classical speech enhancer about the probability of speech presence in different time-frequency regions. Using an oracle ideal binary mask, we show that the proposed method results in a higher predicted quality than other methods of applying a binary mask whilst preserving the improvements in predicted intelligibility.
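As a rough illustration of the two ingredients mentioned above, the sketch below computes an oracle ideal binary mask from known speech and noise powers and then maps it to a prior probability of speech presence instead of using it as a gain. It is a sketch under stated assumptions, not the authors' implementation: the 0 dB threshold, the prior values 0.9 and 0.1, and both function names are hypothetical, and the speech enhancer that would consume the prior is not shown.

```python
def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Oracle mask: 1 where the local SNR exceeds threshold_db, else 0.

    speech_power and noise_power are lists of rows (time frames) of
    per-frequency-bin powers; both must be known, hence "oracle".
    """
    ratio = 10.0 ** (threshold_db / 10.0)  # dB threshold as a power ratio
    return [
        [1 if s > n * ratio else 0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def mask_to_speech_presence_prior(mask, p_present=0.9, p_absent=0.1):
    """Map each mask cell to a prior speech-presence probability.

    The two probability values are illustrative assumptions.
    """
    return [[p_present if m else p_absent for m in row] for row in mask]


# Toy 2x2 time-frequency grid with unit noise power in every cell.
speech = [[4.0, 0.5], [1.0, 9.0]]
noise = [[1.0, 1.0], [1.0, 1.0]]

mask = ideal_binary_mask(speech, noise)
prior = mask_to_speech_presence_prior(mask)
```

Keeping the prior strictly between 0 and 1 means a downstream enhancer is informed by the mask rather than forced to zero out masked regions, which is the distinction the abstract draws from direct mask application.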
Journal article: Ciganovic N, Wolde-Kidan A, Reichenbach JDT, 2017,
The mammalian sense of hearing relies on two types of sensory cells: inner hair cells transmit the auditory stimulus to the brain, while outer hair cells mechanically modulate the stimulus through active feedback. Stimulation of a hair cell is mediated by displacements of its mechanosensitive hair bundle which protrudes from the apical surface of the cell into a narrow fluid-filled space between reticular lamina and tectorial membrane. While hair bundles of inner hair cells are of linear shape, those of outer hair cells exhibit a distinctive V-shape. The biophysical rationale behind this morphology, however, remains unknown. Here we use analytical and computational methods to study the fluid flow across rows of differently shaped hair bundles. We find that rows of V-shaped hair bundles have a considerably reduced resistance to crossflow, and that the biologically observed shapes of hair bundles of outer hair cells are near-optimal in this regard. This observation accords with the function of outer hair cells and lends support to the recent hypothesis that inner hair cells are stimulated by a net flow, in addition to the well-established shear flow that arises from shearing between the reticular lamina and the tectorial membrane.
Conference paper: Picinali L, Wallin A, Levtov Y, et al., 2017,
Comparative perceptual evaluation between different methods for implementing Reverberation in a binaural context, AES 2017, Publisher: Audio Engineering Society
Reverberation has always been considered of primary importance in order to improve the realism, externalisation and immersiveness of binaurally spatialised sounds. Different techniques exist for implementing reverberation in a binaural context, each with a different level of computational complexity and spatial accuracy. A perceptual study has been performed in order to compare the realism and localization accuracy achieved using 5 different binaural reverberation techniques. These included multichannel Ambisonic-based, stereo and mono reverberation methods. A custom web-based application has been developed that implements the testing procedures and allows participants to take the test remotely. Initial results with 54 participants show that no major difference in terms of perceived level of realism and spatialisation accuracy could be found between four of the five proposed reverberation methods, suggesting that a high level of complexity in the reverberation process does not always correspond to improved perceptual attributes.
Journal article: Doire CSJ, Brookes DM, Naylor PA, 2017,
The efficient measurement of the threshold and slope of the psychometric function (PF) is an important objective in psychoacoustics. This paper proposes a procedure that combines a Bayesian estimate of the PF with either a look one-ahead or a look two-ahead method of selecting the next stimulus presentation. The procedure differs from previously proposed algorithms in two respects: (i) it does not require the range of possible PF parameters to be specified in advance and (ii) the sequence of probe signal-to-noise ratios optimizes the threshold and slope estimates at a performance level, ϕ, that can be chosen by the experimenter. Simulation results show that the proposed procedure is robust and that the estimates of both threshold and slope have a consistently low bias. Over a wide range of listener PF parameters, the root-mean-square errors after 50 trials were ∼1.2 dB in threshold and 0.14 in log-slope. It was found that the performance differences between the look one-ahead and look two-ahead methods were negligible and that an entropy-based criterion for selecting the next stimulus was preferred to a variance-based criterion.
Conference paper: Pinero G, Naylor PA, 2017,
CHANNEL ESTIMATION FOR CROSSTALK CANCELLATION IN WIRELESS ACOUSTIC NETWORKS, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 586-590, ISSN: 1520-6149
Conference paper: Javed HA, Cauchi B, Doclo S, et al., 2017,
MEASURING, MODELLING AND PREDICTING PERCEIVED REVERBERATION, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Publisher: IEEE, Pages: 381-385, ISSN: 1520-6149
Conference paper: Forte AE, Etard O, Reichenbach J, 2017,
Complex Auditory-brainstem Response to the Fundamental Frequency of Continuous Natural Speech, ARO 2017
Book: Jarrett DP, Habets EAP, Naylor PA, 2017,
Conference paper: Evers C, Moore A, Naylor P, 2016,
Acoustic Simultaneous Localization and Mapping (a-SLAM) jointly localizes the trajectory of a microphone array installed on a moving platform, whilst estimating the acoustic map of surrounding sound sources, such as human speakers. Whilst traditional approaches for SLAM in the vision and optical research literature rely on the assumption that the surrounding map features are static, in the acoustic case the positions of talkers are usually time-varying due to head rotations and body movements. This paper demonstrates that tracking of moving sources can be incorporated in a-SLAM by modelling the acoustic map as a Random Finite Set (RFS) of multiple sources and explicitly imposing models of the source dynamics. The proposed approach is verified and its performance evaluated for realistic simulated data.
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.