Engel Alonso Martinez J, Goodman D, Picinali L, 2021, Assessing HRTF preprocessing methods for Ambisonics rendering through perceptual models, Acta Acustica -Peking-, ISSN: 0371-0025
Binaural rendering of Ambisonics signals is a common way to reproduce spatial audio content. Processing Ambisonics signals at low spatial orders is desirable in order to reduce complexity, although it may degrade the perceived quality, in part due to the mismatch that occurs when a low-order Ambisonics signal is paired with a spatially dense head-related transfer function (HRTF). In order to alleviate this issue, the HRTF may be preprocessed so its spatial order is reduced. Several preprocessing methods have been proposed, but they have not been thoroughly compared yet. In this study, nine HRTF preprocessing methods were used to render anechoic binaural signals from Ambisonics representations of orders 1 to 44, and these were compared through perceptual hearing models in terms of localisation performance, externalisation and speech reception. This assessment was supported by numerical analyses of HRTF interpolation errors, interaural differences, perceptually-relevant spectral differences, and loudness stability. Models predicted that the binaural renderings’ accuracy increased with spatial order, as expected. A notable effect of the preprocessing method was observed: whereas all methods performed similarly at the highest spatial orders, some were considerably better at lower orders. A newly proposed method, BiMagLS, displayed the best performance overall and is recommended for the rendering of bilateral Ambisonics signals. The results, which were in line with previous literature, indirectly validate the perceptual models’ ability to predict listeners’ responses in a consistent and explicable manner.
Sethi SS, Ewers RM, Jones NS, et al., 2021, Soundscapes predict species occurrence in tropical forests, OIKOS, Pages: 1-9, ISSN: 0030-1299
Accurate occurrence data is necessary for the conservation of keystone or endangered species, but acquiring it is usually slow, laborious and costly. Automated acoustic monitoring offers a scalable alternative to manual surveys but identifying species vocalisations requires large manually annotated training datasets, and is not always possible (e.g. for lesser studied or silent species). A new approach is needed that rapidly predicts species occurrence using smaller and more coarsely labelled audio datasets. We investigated whether local soundscapes could be used to infer the presence of 32 avifaunal and seven herpetofaunal species in 20 min recordings across a tropical forest degradation gradient in Sabah, Malaysia. Using acoustic features derived from a convolutional neural network (CNN), we characterised species indicative soundscapes by training our models on a temporally coarse labelled point-count dataset. Soundscapes successfully predicted the occurrence of 34 out of the 39 species across the two taxonomic groups, with area under the curve (AUC) metrics from 0.53 up to 0.87. The highest accuracies were achieved for species with strong temporal occurrence patterns. Soundscapes were a better predictor of species occurrence than above-ground carbon density – a metric often used to quantify habitat quality across forest degradation gradients. Our results demonstrate that soundscapes can be used to efficiently predict the occurrence of a wide variety of species and provide a new direction for data driven large-scale assessments of habitat suitability.
Vickers D, Salorio-Corbetto M, Driver S, et al., 2021, Involving children and teenagers with bilateral cochlear implants in the design of the BEARS (Both EARS) virtual reality training suite improves personalization, Frontiers in Digital Health, Vol: 3, ISSN: 2673-253X
Older children and teenagers with bilateral cochlear implants often have poor spatial hearing because they cannot fuse sounds from the two ears. This deficit jeopardizes speech and language development, education, and social well-being. The lack of protocols for fitting bilateral cochlear implants and resources for spatial-hearing training contribute to these difficulties. Spatial hearing develops with bilateral experience. A large body of research demonstrates that sound localisation can improve with training, underpinned by plasticity-driven changes in the auditory pathways. Generalizing training to non-trained auditory skills is best achieved by using a multi-modal (audio-visual) implementation and multi-domain training tasks (localisation, speech-in-noise, and spatial music). The goal of this work was to develop a package of virtual-reality games (BEARS, Both EARS) to train spatial hearing in young people (8–16 years) with bilateral cochlear implants using an action-research protocol. The action research protocol used formalized cycles for participants to trial aspects of the BEARS suite, reflect on their experiences, and in turn inform changes in the game implementations. This participatory design used the stakeholder participants as co-creators. The cycles for each of the three domains (localisation, spatial speech-in-noise, and spatial music) were customized to focus on the elements that the stakeholder participants considered important. The participants agreed that the final games were appropriate and ready to be used by patients. The main areas of modification were: the variety of immersive scenarios to cover age range and interests, the number of levels of complexity to ensure small improvements were measurable, feedback, and reward schemes to ensure positive reinforcement, and an additional implementation on an iPad for those who had difficulties with the headsets due to age or balance issues. The effectiveness of the BEARS training suite will be ev
Setti W, Cuturi LF, Engel I, et al., 2021, The Influence of Early Visual Deprivation on Audio-Spatial Working Memory, NEUROPSYCHOLOGY, ISSN: 0894-4105
Engel Alonso Martinez J, Goodman DFM, Picinali L, 2021, Improving Binaural Rendering with Bilateral Ambisonics and MagLS, DAGA 2021
Heath BE, Orme DS, Sethi CSL, et al., 2021, How index selection, compression, and recording schedule impact the description of ecological soundscapes, Evolutionary Ecology, Vol: 11, Pages: 13206-13217, ISSN: 0269-7653
Acoustic indices derived from environmental soundscape recordings are being used to monitor ecosystem health and vocal animal biodiversity. Soundscape data can quickly become very expensive and difficult to manage, so data compression or temporal down-sampling are sometimes employed to reduce data storage and transmission costs. These parameters vary widely between experiments, with the consequences of this variation remaining mostly unknown. We analyse field recordings from North-Eastern Borneo across a gradient of historical land use. We quantify the impact of experimental parameters (MP3 compression, recording length and temporal subsetting) on soundscape descriptors (Analytical Indices and a convolutional neural net derived AudioSet Fingerprint). Both descriptor types were tested for their robustness to parameter alteration and their usability in a soundscape classification task. We find that compression and recording length both drive considerable variation in calculated index values. However, we find that the effects of this variation and temporal subsetting on the performance of classification models are minor: performance is much more strongly determined by acoustic index choice, with AudioSet fingerprinting offering substantially greater (12%–16%) levels of classifier accuracy, precision and recall. We advise using the AudioSet Fingerprint in soundscape analysis, finding superior and consistent performance even on small pools of data. If data storage is a bottleneck to a study, we recommend Variable Bit Rate encoded compression (quality = 0) to reduce file size to 23% of the original without affecting most Analytical Index values. The AudioSet Fingerprint can be compressed further to a Constant Bit Rate encoding of 64 kb/s (8% file size) without any detectable effect. These recommendations allow the efficient use of restricted data storage whilst permitting comparability of results between different studies.
Lim V, Khan S, Picinali L, 2021, Towards a more accessible cultural heritage: challenges and opportunities in contextualization using 3D sound narratives, Applied Sciences, ISSN: 2076-3417
Cuevas-Rodriguez M, Gonzalez-Toledo D, Reyes-Lecuona A, et al., 2021, Impact of non-individualised head related transfer functions on speech-in-noise performances within a synthesised virtual environment, The Journal of the Acoustical Society of America, Vol: 149, Pages: 2573-2586, ISSN: 0001-4966
When performing binaural spatialisation, it is widely accepted that the choice of the head related transfer functions (HRTFs), and in particular the use of individually measured ones, can have an impact on localisation accuracy, externalization, and overall realism. Yet the impact of HRTF choices on speech-in-noise performances in cocktail party-like scenarios has not been investigated in depth. This paper introduces a study where 22 participants were presented with a frontal speech target and two lateral maskers, spatialised using a set of non-individual HRTFs. Speech reception threshold (SRT) was measured for each HRTF. Furthermore, using the SRT predicted by an existing speech perception model, the measured values were compensated in an attempt to remove overall HRTF-specific benefits. Results show significant overall differences among the SRTs measured using different HRTFs, consistent with the results predicted by the model. Individual differences between participants related to their SRT performances using different HRTFs could also be found, but their significance was reduced after the compensation. The implications of these findings are relevant to several research areas related to spatial hearing and speech perception, suggesting that when testing speech-in-noise performances within binaurally rendered virtual environments, the choice of the HRTF for each individual should be carefully considered.
Comunita M, Gerino A, Lim V, et al., 2021, Design and Evaluation of a Web- and Mobile-based Binaural Audio Platform for Cultural Heritage, Applied Sciences, Vol: 11, ISSN: 2076-3417
PlugSonic is a suite of web- and mobile-based applications for the curation and experience of 3D interactive soundscapes and sonic narratives in the cultural heritage context. It was developed as part of the PLUGGY EU project (Pluggable Social Platform for Heritage Awareness and Participation) and consists of two main applications: PlugSonic Sample, to edit and apply audio effects, and PlugSonic Soundscape, to create and experience 3D soundscapes for headphones playback. The audio processing within PlugSonic is based on the Web Audio API and the 3D Tune-In Toolkit, while the mobile exploration of soundscapes in a physical space is obtained using Apple’s ARKit. The main goal of PlugSonic is technology democratisation; PlugSonic users—whether cultural institutions or citizens—are all given the instruments needed to create, process and experience 3D soundscapes and sonic narratives; without the need for specific devices, external tools (software and/or hardware), specialised knowledge or custom development. The aims of this paper are to present the design and development choices, the user involvement processes as well as a final evaluation conducted with inexperienced users on three tasks (creation, curation and experience), demonstrating how PlugSonic is indeed a simple, effective, yet powerful tool.
Engel Alonso-Martinez I, Henry C, Amengual Garí SV, et al., 2021, Perceptual implications of different Ambisonics-based methods for binaural reverberation, Journal of the Acoustical Society of America, Vol: 149, ISSN: 0001-4966
Reverberation is essential for the realistic auralisation of enclosed spaces. However, it can be computationally expensive to render with high fidelity and, in practice, simplified models are typically used to lower costs while preserving perceived quality. Ambisonics-based methods may be employed to this purpose as they allow us to render a reverberant sound field more efficiently by limiting its spatial resolution. The present study explores the perceptual impact of two simplifications of Ambisonics-based binaural reverberation that aim to improve efficiency. First, a “hybrid Ambisonics” approach is proposed in which the direct sound path is generated by convolution with a spatially dense head related impulse response set, separately from reverberation. Second, the reverberant virtual loudspeaker method (RVL) is presented as a computationally efficient approach to dynamically render binaural reverberation for multiple sources with the potential limitation of inaccurately simulating listener's head rotations. Numerical and perceptual evaluations suggest that the perceived quality of hybrid Ambisonics auralisations of two measured rooms ceased to improve beyond the third order, which is a lower threshold than what was found by previous studies in which the direct sound path was not processed separately. Additionally, RVL is shown to produce auralisations with comparable perceived quality to Ambisonics renderings.
Hallewell M, Patel H, Salanitri D, et al., 2021, Play&tune: user feedback in the development of a serious game for optimizing hearing aid orientation, Ergonomics in Design: The Quarterly of Human Factors Applications, Vol: 29, Pages: 14-24, ISSN: 1064-8046
Many hearing aid (HA) users are dissatisfied with HA performance in social situations. One way to improve HA outcomes is training the users to understand how HAs work. Play&Tune was designed to provide this training and to foster autonomy in hearing rehabilitation. We carried out two prototype evaluations and a prerelease evaluation of Play&Tune with 71 HA users, using an interview or online survey. Users gave detailed feedback on their experiences with the app. Most participants enjoyed learning about HAs and expressed a desire for autonomy over their HA settings. Our case study reinforces the importance of user feedback during app development.
Setti W, Engel IAM, Cuturi LF, et al., 2021, The Audio-Corsi: an acoustic virtual reality-based technological solution for evaluating audio-spatial memory abilities, Journal on Multimodal User Interfaces, ISSN: 1783-7677
Spatial memory is a cognitive skill that allows the recall of information about the space, its layout, and items’ locations. We present a novel application built around 3D spatial audio technology to evaluate audio-spatial memory abilities. The sound sources have been spatially distributed employing the 3D Tune-In Toolkit, a virtual acoustic simulator. The participants are presented with sequences of sounds of increasing length emitted from virtual auditory sources around their heads. To identify stimuli positions and register the test responses, we designed a custom-made interface with buttons arranged according to sound locations. We took inspiration from the Corsi-Block test for the experimental procedure, a validated clinical approach for assessing visuo-spatial memory abilities. In two different experimental sessions, the participants were tested with the classical Corsi-Block and, blindfolded, with the proposed task, named Audio-Corsi for brevity. Our results show comparable performance across the two tests in terms of the estimated memory parameter precision. Furthermore, in the Audio-Corsi we observe a lower span compared to the Corsi-Block test. We discuss these results in the context of the theoretical relationship between the auditory and visual sensory modalities and potential applications of this system in multiple scientific and clinical contexts.
Kim C, Lim V, Picinali L, 2020, Investigation into consistency of subjective and objective perceptual selection of non-individual head-related transfer functions, Journal of the Audio Engineering Society, Vol: 68, Pages: 819-831, ISSN: 1549-4950
The binaural technique uses a set of direction-dependent filters known as Head-Related Transfer Functions (HRTFs) in order to create 3D soundscapes through a pair of headphones. Although each HRTF is unique to the person it is measured from, due to the cost and complexity of the measurement process pre-measured non-individual HRTFs are generally used. This study investigates whether it is possible for a listener to perceptually select the best-fitting non-individual HRTFs in a consistent manner, using both subjective and objective methods. 16 subjects participated in 3 repeated sessions of binaural listening tests. During each session, participants firstly listened to moving sound sources spatialized using 7 different non-individual HRTFs and ranked them according to perceived plausibility and externalization (subjective selection). They then performed a localization task with sources spatialized using the same HRTFs (objective selection). In the subjective selection, 3 to 9 participants showed test-retest reliability levels that could be regarded as good or excellent depending on the attribute under question, the source type, and the trajectory. The reliability was better for participants with musical training and critical audio listening experience. In the objective selection, it was not possible to find significant differences between the tested HRTFs based on localization-related performances.
Sethi S, Ewers R, Jones N, et al., 2020, SAFE Acoustics: an open-source, real-time eco-acoustic monitoring network in the tropical rainforests of Borneo, Methods in Ecology and Evolution, Vol: 11, Pages: 1182-1185, ISSN: 2041-210X
1. Automated monitoring approaches offer an avenue to unlocking large‐scale insight into how ecosystems respond to human pressures. However, since data collection and data analyses are often treated independently, there are currently no open‐source examples of end‐to‐end, real‐time ecological monitoring networks. 2. Here, we present the complete implementation of an autonomous acoustic monitoring network deployed in the tropical rainforests of Borneo. Real‐time audio is uploaded remotely from the field, indexed by a central database, and delivered via an API to a public‐facing website. 3. We provide the open‐source code and design of our monitoring devices, the central web2py database, and the ReactJS website. Furthermore, we demonstrate an extension of this infrastructure to deliver real‐time analyses of the eco‐acoustic data. 4. By detailing a fully functional, open source, and extensively tested design, our work will accelerate the rate at which fully autonomous monitoring networks mature from technological curiosities, and towards genuinely impactful tools in ecology.
Frost E, Porat T, Malhotra P, et al., 2020, Collaborative design of a gamified application for auditory-cognitive training, JMIR Human Factors, Vol: 7, ISSN: 2292-9495
Background: Multiple gaming applications under the dementia umbrella for skills such as navigation exist, but there has yet to be an application designed specifically to investigate the role hearing loss may have in the process of cognitive decline. There is a demonstrable gap in utilising serious games to further the knowledge of the potential relationship between hearing loss and dementia. Objective: The aim of this study was to identify the needs, facilitators and barriers in designing a novel auditory-cognitive training gaming application. Methods: A participatory design approach was used to engage key stakeholders across audiology and cognitive disorders specialisms. Two rounds, including paired semi-structured interviews and focus groups were completed and thematically analysed. Results: 18 stakeholders participated in total and 6 themes were identified to inform the next stage of the application’s development. Conclusions: The findings can now be implemented into the development of the beta-version of the application. The application will be evaluated against outcome measures of speech listening in noise, cognitive and attentional tasks, quality of life and usability.
Sethi SS, Ewers RM, Jones NS, et al., 2020, Soundscapes predict species occurrence in tropical forests, Publisher: Cold Spring Harbor Laboratory
Accurate occurrence data is necessary for the conservation of keystone or endangered species, but acquiring it is usually slow, laborious, and costly. Automated acoustic monitoring offers a scalable alternative to manual surveys, but identifying species vocalisations requires large manually annotated training datasets, and is not always possible (e.g., for silent species). A new, intermediate approach is needed that rapidly predicts species occurrence without requiring extensive labelled data. We investigated whether local soundscapes could be used to infer the presence of 32 avifaunal and seven herpetofaunal species across a tropical forest degradation gradient in Sabah, Malaysia. We developed a machine-learning based approach to characterise species indicative soundscapes, training our models on a coarsely labelled manual point-count dataset. Soundscapes successfully predicted the occurrence of 34 out of the 39 species across the two taxonomic groups, with area under the curve (AUC) metrics of up to 0.87 (Bold-striped Tit-babbler Macronus bornensis). The highest accuracies were achieved for common species with strong temporal occurrence patterns. Soundscapes were a better predictor of species occurrence than above-ground biomass – a metric often used to quantify habitat quality across forest degradation gradients. Synthesis and applications: Our results demonstrate that soundscapes can be used to efficiently predict the occurrence of a wide variety of species. This provides a new direction for audio data to deliver large-scale, accurate assessments of habitat suitability using cheap and easily obtained field datasets.
Comunità M, Gerino A, Lim V, et al., 2020, PlugSonic: a web- and mobile-based platform for binaural audio and sonic narratives, Publisher: arXiv
PlugSonic is a suite of web- and mobile-based applications for the curation and experience of binaural interactive soundscapes and sonic narratives. It was developed as part of the PLUGGY EU project (Pluggable Social Platform for Heritage Awareness and Participation) and consists of two main applications: PlugSonic Sample, to edit and apply audio effects, and PlugSonic Soundscape, to create and experience binaural soundscapes. The audio processing within PlugSonic is based on the Web Audio API and the 3D Tune-In Toolkit, while the exploration of soundscapes in a physical space is obtained using Apple's ARKit. In this paper we present the design choices, the user involvement processes and the implementation details. The main goal of PlugSonic is technology democratisation; PlugSonic users - whether institutions or citizens - are all given the instruments needed to create, process and experience 3D soundscapes and sonic narrative; without the need for specific devices, external tools (software and/or hardware), specialised knowledge or custom development. The evaluation, which was conducted with inexperienced users on three tasks - creation, curation and experience - demonstrates how PlugSonic is indeed a simple, effective, yet powerful tool.
Sethi S, Jones NS, Fulcher B, et al., 2020, Characterising soundscapes across diverse ecosystems using a universal acoustic feature set, Proceedings of the National Academy of Sciences of USA, Vol: 117, Pages: 17049-17055, ISSN: 0027-8424
Natural habitats are being impacted by human pressures at an alarming rate. Monitoring these ecosystem-level changes often requires labor-intensive surveys that are unable to detect rapid or unanticipated environmental changes. Here we have developed a generalizable, data-driven solution to this challenge using eco-acoustic data. We exploited a convolutional neural network to embed soundscapes from a variety of ecosystems into a common acoustic space. In both supervised and unsupervised modes, this allowed us to accurately quantify variation in habitat quality across space and in biodiversity through time. On the scale of seconds, we learned a typical soundscape model that allowed automatic identification of anomalous sounds in playback experiments, providing a potential route for real-time automated detection of irregular environmental behavior including illegal logging and hunting. Our highly generalizable approach, and the common set of features, will enable scientists to unlock previously hidden insights from acoustic data and offers promise as a backbone technology for global collaborative autonomous ecosystem monitoring efforts.
Vijayasingam A, Frost E, Wilkins J, et al., 2020, Tablet and web-based audiometry to screen for hearing loss in adults with cystic fibrosis, Thorax, Vol: 75, Pages: 632-639, ISSN: 0040-6376
INTRODUCTION: Individuals with chronic lung disease (eg, cystic fibrosis (CF)) often receive antimicrobial therapy including aminoglycosides resulting in ototoxicity. Extended high-frequency audiometry has increased sensitivity for ototoxicity detection, but diagnostic audiometry in a sound-booth is costly, time-consuming and requires a trained audiologist. This cross-sectional study analysed tablet-based audiometry (Shoebox MD) performed by non-audiologists in an outpatient setting, alongside home web-based audiometry (3D Tune-In) to screen for hearing loss in adults with CF. METHODS: Hearing was analysed in 126 CF adults using validated questionnaires, a web self-hearing test (0.5 to 4 kHz), tablet (0.25 to 12 kHz) and sound-booth audiometry (0.25 to 12 kHz). A threshold of ≥25 dB hearing loss at ≥1 audiometric frequency was considered abnormal. Demographics and mitochondrial DNA sequencing were used to analyse risk factors, and accuracy and usability of hearing tests determined. RESULTS: Prevalence of hearing loss within any frequency band tested was 48%. Multivariate analysis showed age (OR 1.127; (95% CI: 1.07 to 1.18; p value<0.0001) per year older) and total intravenous antibiotic days over 10 years (OR 1.006; (95% CI: 1.002 to 1.010; p value=0.004) per further intravenous day) were significantly associated with increased risk of hearing loss. Tablet audiometry had good usability, was 93% sensitive, 88% specific with 94% negative predictive value to screen for hearing loss compared with web self-test audiometry and questionnaires which had poor sensitivity (17% and 13%, respectively). Intraclass correlation (ICC) of tablet versus sound-booth audiometry showed high correlation (ICC >0.9) at all frequencies ≥4 kHz. CONCLUSIONS: Adults with CF have a high prevalence of drug-related hearing loss and tablet-based audiometry can be a practical, accurate screening tool within integrated ototoxicity monitoring programmes for early detection.
Griffin E, Picinali L, Scase M, 2020, The effectiveness of an interactive audio‐tactile map for the process of cognitive mapping and recall among people with visual impairments, Brain and Behavior, Vol: 10, ISSN: 2162-3279
Background: People with visual impairments can experience numerous challenges navigating unfamiliar environments. Systems that operate as prenavigation tools can assist such individuals. This mixed‐methods study examined the effectiveness of an interactive audio‐tactile map tool on the process of cognitive mapping and recall, among people who were blind or had visual impairments. The tool was developed with the involvement of visually impaired individuals who additionally provided further feedback throughout this research. Methods: A mixed‐methods experimental design was employed. Fourteen participants were allocated to either an experimental group who were exposed to an audio‐tactile map, or a control group exposed to a verbally annotated tactile map. After five minutes’ exposure, multiple‐choice questions examined participants’ recall of the spatial and navigational content. Subsequent semi‐structured interviews were conducted to examine their views surrounding the study and the product. Results: The experimental condition had significantly better overall recall than the control group and higher average scores in all four areas examined by the questions. The interviews suggested that the interactive component offered individuals the freedom to learn the map in several ways and did not restrict them to a sequential and linear approach to learning. Conclusion: Assistive technology can reduce challenges faced by people with visual impairments, and the flexible learning approach offered by the audio‐tactile map may be of particular value. Future researchers and assistive technology developers may wish to explore this further.
Sethi S, Jones N, Fulcher B, et al., 2019, Combining machine learning and a universal acoustic feature-set yields efficient automated monitoring of ecosystems, Publisher: bioRxiv
Natural habitats are being impacted by human pressures at an alarming rate. Monitoring these ecosystem-level changes often requires labour-intensive surveys that are unable to detect rapid or unanticipated environmental changes. Here we developed a generalisable, data-driven solution to this challenge using eco-acoustic data. We exploited a convolutional neural network to embed ecosystem soundscapes from a wide variety of biomes into a common acoustic space. In both supervised and unsupervised modes, this allowed us to accurately quantify variation in habitat quality across space and in biodiversity through time. On the scale of seconds, we learned a typical soundscape model that allowed automatic identification of anomalous sounds in playback experiments, paving the way for real-time detection of irregular environmental behaviour including illegal activity. Our highly generalisable approach, and the common set of features, will enable scientists to unlock previously hidden insights from eco-acoustic data and offers promise as a backbone technology for global collaborative autonomous ecosystem monitoring efforts.
Steadman M, Kim C, Lestang J-H, et al., 2019, Short-term effects of sound localization training in virtual reality, Scientific Reports, Vol: 9, ISSN: 2045-2322
Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain’s ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated; one provided simple visual positional confirmation of sound source location, a second introduced game design elements (“gamification”) and a final version additionally utilized head-tracking to provide listeners with experience of relative sound source motion (“active listening”). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgement for the active listening group. The implications of this on the putative mechanisms of the adaptation process are discussed.
Steadman MA, Kim C, Lestang J-H, et al., 2019, Short-term effects of sound localization training in virtual reality, Publisher: biorxiv
Head-related transfer functions (HRTFs) capture the direction-dependent way that sound interacts with the head and torso. In virtual audio systems, which aim to emulate these effects, non-individualized, generic HRTFs are typically used, leading to an inaccurate perception of virtual sound location. Training has the potential to exploit the brain's ability to adapt to these unfamiliar cues. In this study, three virtual sound localization training paradigms were evaluated; one provided simple visual positional confirmation of sound source location, a second introduced game design elements ("gamification") and a final version additionally utilized head-tracking to provide listeners with experience of relative sound source motion ("active listening"). The results demonstrate a significant effect of training after a small number of short (12-minute) training sessions, which is retained across multiple days. Gamification alone had no significant effect on the efficacy of the training, but active listening resulted in significantly greater improvements in localization accuracy. In general, improvements in virtual sound localization following training generalized to a second set of non-individualized HRTFs, although some HRTF-specific changes were observed in polar angle judgement for the active listening group. The implications of this on the putative mechanisms of the adaptation process are discussed.
Engel Alonso-Martinez I, Henry C, Amengual Gari SV, et al., 2019, Perceptual comparison of ambisonics-based reverberation methods in binaural listening, EAA Spatial Audio Signal Processing Symposium
Vijayasingam A, Frost E, Wilkins J, et al., 2019, S140 Interim results from a prospective study of tablet and web-based audiometry to detect ototoxicity in adults with cystic fibrosis (vol 73, pg A87, 2018), THORAX, Vol: 74, Pages: 723-723, ISSN: 0040-6376
Comunità M, Gerino A, Lim V, et al., 2019, Web-based binaural audio and sonic narratives for cultural heritage, Conference on Immersive and Interactive Audio, Publisher: Audio Engineering Society
This paper introduces PlugSonic Soundscape and PlugSonic Sample, two web-based applications for the creation and experience of binaural interactive audio narratives and soundscapes. The apps are being developed as part of the PLUGGY EU project (Pluggable Social Platform for Heritage Awareness and Participation). The apps' audio processing is based on the Web Audio API and the 3D Tune-In toolkit. Within the paper, we report on the implementation, evaluation and future developments. We believe that the idea of a web-based application for 3D sonic narratives represents a novel contribution to the cultural heritage, digital storytelling and 3D audio technology domains.
Picinali L, Hrafnkelsson R, Reyes-Lecuona A, 2019, The 3D tune-in toolkit VST binaural audio plugin, 2019 AES International Conference on Immersive and Interactive Audio
This demo paper aims at introducing a novel VST binaural audio plugin based on the 3D Tune-In (3DTI) Toolkit, a multiplatform open-source C++ library which includes several functionalities for headphone-based sound spatialisation, together with generalised hearing aid and hearing loss simulators. The 3DTI Toolkit VST plugin integrates all the binaural spatialisation functionalities of the 3DTI Toolkit for one single audio source, which can be positioned and moved around the listener. The spatialisation is based on direct convolution with any user-imported Head Related Transfer Function (HRTF) set. Interaural Time Differences (ITDs) are customised in real-time according to the listener’s head circumference. Binaural reverberation is performed using a virtual-loudspeakers Ambisonic approach and convolution with user-imported Binaural Room Impulse Responses (BRIRs). Additional processes for near- and far-field sound sources simulations are also included.
Cuevas-Rodríguez M, Picinali L, González-Toledo D, et al., 2019, 3D Tune-In Toolkit: An open-source library for real-time binaural spatialisation, PLoS ONE, Vol: 14, ISSN: 1932-6203
The 3D Tune-In Toolkit (3DTI Toolkit) is an open-source standard C++ library which includes a binaural spatialiser. This paper presents the technical details of this renderer, outlining its architecture and describing the processes implemented in each of its components. In order to put this description into context, the basic concepts behind binaural spatialisation are reviewed through a chronology of research milestones in the field in the last 40 years. The 3DTI Toolkit renders the anechoic signal path by convolving sound sources with Head Related Impulse Responses (HRIRs), obtained by interpolating those extracted from a set that can be loaded from any file in a standard audio format. Interaural time differences are managed separately, in order to be able to customise the rendering according to the head size of the listener, and to reduce comb-filtering when interpolating between different HRIRs. In addition, geometrical and frequency-dependent corrections for simulating near-field sources are included. Reverberation is computed separately using a virtual loudspeakers Ambisonic approach and convolution with Binaural Room Impulse Responses (BRIRs). In all these processes, special care has been taken to avoid audible artefacts produced by changes in gains and audio filters due to the movements of sources and of the listener. The 3DTI Toolkit performance, as well as some other relevant metrics such as non-linear distortion, are assessed and presented, followed by a comparison between the features offered by the 3DTI Toolkit and those found in other currently available open- and closed-source binaural renderers.
Stitt P, Picinali L, Katz BFG, 2019, Auditory accommodation to poorly matched non-individual spectral localization cues through active learning, Scientific Reports, Vol: 9, Pages: 1-14, ISSN: 2045-2322
This study examines the effect of adaptation to non-ideal auditory localization cues represented by the Head-Related Transfer Function (HRTF) and the retention of training for up to three months after the last session. Continuing from a previous study on rapid non-individual HRTF learning, subjects using non-individual HRTFs were tested alongside control subjects using their own measured HRTFs. Perceptually worst-rated non-individual HRTFs were chosen to represent the worst-case scenario in practice and to allow for maximum potential for improvement. The methodology consisted of a training game and a localization test to evaluate performance carried out over 10 sessions. Sessions 1–4 occurred at 1-week intervals, performed by all subjects. During initial sessions, subjects showed improvement in localization performance for polar error. Following this, half of the subjects stopped the training game element, continuing with only the localization task. The group that continued to train showed improvement, with 3 of 8 subjects achieving group mean polar errors comparable to the control group. The majority of the group that stopped the training game retained their performance attained at the end of session 4. In general, adaptation was found to be quite subject dependent, highlighting the limits of HRTF adaptation in the case of poor HRTF matches. No identifier to predict learning ability was observed.
Engel Alonso-Martinez I, Goodman D, Picinali L, 2019, The Effect of Auditory Anchors on Sound Localization: A Preliminary Study, 2019 AES International Conference on Immersive and Interactive Audio