Schüller A, Schilling A, Krauss P, et al., 2023, Attentional Modulation of the Cortical Contribution to the Frequency-Following Response Evoked by Continuous Speech, J Neurosci, Vol: 43, Pages: 7429-7440
Selective attention to one of several competing speakers is required for comprehending a target speaker among other voices and for successful communication with them. It moreover has been found to involve the neural tracking of low-frequency speech rhythms in the auditory cortex. Effects of selective attention have also been found in subcortical neural activities, in particular regarding the frequency-following response related to the fundamental frequency of speech (speech-FFR). Recent investigations have, however, shown that the speech-FFR contains cortical contributions as well. It remains unclear whether these are also modulated by selective attention. Here we used magnetoencephalography to assess the attentional modulation of the cortical contributions to the speech-FFR. We presented both male and female participants with two competing speech signals and analyzed the cortical responses during attentional switching between the two speakers. Our findings revealed robust attentional modulation of the cortical contribution to the speech-FFR: the neural responses were higher when the speaker was attended than when they were ignored. We also found that, regardless of attention, a voice with a lower fundamental frequency elicited a larger cortical contribution to the speech-FFR than a voice with a higher fundamental frequency. Our results show that the attentional modulation of the speech-FFR does not only occur subcortically but extends to the auditory cortex as well. SIGNIFICANCE STATEMENT: Understanding speech in noise requires attention to a target speaker. One of the speech features that a listener can use to identify a target voice among others and attend it is the fundamental frequency, together with its higher harmonics. The fundamental frequency arises from the opening and closing of the vocal folds and is tracked by high-frequency neural activity in the auditory brainstem and in the cortex. Previous investigations showed that the subcortical neural tracking is
Varano E, Guilleminot P, Reichenbach T, 2023, AVbook, a high-frame-rate corpus of narrative audiovisual speech for investigating multimodal speech perception, Journal of the Acoustical Society of America, Vol: 153, Pages: 3130-3137, ISSN: 0001-4966
Thornton M, Mandic D, Reichenbach T, 2022, Robust decoding of the speech envelope from EEG recordings through deep neural networks, Journal of Neural Engineering, Vol: 19, ISSN: 1741-2560
Kegler M, Weissbart H, Reichenbach T, 2022, The neural response at the fundamental frequency of speech is modulated by word-level acoustic and linguistic information, Frontiers in Neuroscience, Vol: 16
Kegler M, Weissbart H, Reichenbach T, 2022, The neural response at the fundamental frequency of speech is modulated by word-level acoustic and linguistic information, Publisher: bioRxiv
Spoken language comprehension requires rapid and continuous integration of information, from lower-level acoustic to higher-level linguistic features. Much of this processing occurs in the cerebral cortex. Its neural activity exhibits, for instance, correlates of predictive processing, emerging at delays of a few hundred milliseconds. However, the auditory pathways are also characterized by extensive feedback loops from higher-level cortical areas to lower-level ones as well as to subcortical structures. Early neural activity can therefore be influenced by higher-level cognitive processes, but it remains unclear whether such feedback contributes to linguistic processing. Here, we investigated early speech-evoked neural activity that emerges at the fundamental frequency. We analyzed EEG recordings obtained when subjects listened to a story read by a single speaker. We identified a response tracking the speaker’s fundamental frequency that occurred at a delay of 11 ms, while another response elicited by the high-frequency modulation of the envelope of higher harmonics exhibited a larger magnitude and longer latency of about 18 ms. Subsequently, we determined the magnitude of these early neural responses for each individual word in the story. We then quantified the context-independent frequency of each word and used a language model to compute context-dependent word surprisal and precision. The word surprisal represented how predictable a word is, given the previous context, and the word precision reflected the confidence about predicting the next word from the past context. We found that the word-level neural responses at the fundamental frequency were predominantly influenced by the acoustic features: the average fundamental frequency and its variability. Amongst the linguistic features, only context-independent word frequency showed a weak but significant modulation of the neural response to the high-frequency envelope modulation. Our results show that the ear
Guilleminot P, Reichenbach T, 2022, Enhancement of speech-in-noise comprehension through vibrotactile stimulation at the syllabic rate, Proceedings of the National Academy of Sciences of the United States of America, Vol: 119, ISSN: 0027-8424
Wairagkar M, Lima MR, Bazo D, et al., 2022, Emotive response to a hybrid-face robot and translation to consumer social robots, IEEE Internet of Things Journal, Vol: 9, Pages: 3174-3188, ISSN: 2327-4662
We present the conceptual formulation, design, fabrication, control and commercial translation of an IoT enabled social robot as mapped through validation of human emotional response to its affective interactions. The robot design centres on a humanoid hybrid-face that integrates a rigid faceplate with a digital display to simplify conveyance of complex facial movements while providing the impression of three-dimensional depth. We map the emotions of the robot to specific facial feature parameters, characterise recognisability of archetypical facial expressions, and introduce pupil dilation as an additional degree of freedom for emotion conveyance. Human interaction experiments demonstrate the ability to effectively convey emotion from the hybrid-robot face to humans. Conveyance is quantified by studying neurophysiological electroencephalography (EEG) response to perceived emotional information as well as through qualitative interviews. Results demonstrate core hybrid-face robotic expressions can be discriminated by humans (80%+ recognition) and invoke face-sensitive neurophysiological event-related potentials such as N170 and Vertex Positive Potentials in EEG. The hybrid-face robot concept has been modified, implemented, and released by Emotix Inc in the commercial IoT robotic platform Miko (‘My Companion’), an affective robot currently in use for human-robot interaction with children. We demonstrate that human EEG responses to Miko emotions are comparable to those of the hybrid-face robot, validating design modifications implemented for large scale distribution. Finally, interviews show above 90% expression recognition rates in our commercial robot. We conclude that simplified hybrid-face abstraction conveys emotions effectively and enhances human-robot interaction.
Etard O, Ben Messaoud R, Gaugain G, et al., 2022, No Evidence of Attentional Modulation of the Neural Response to the Temporal Fine Structure of Continuous Musical Pieces, Journal of Cognitive Neuroscience, Vol: 34, Pages: 411-424, ISSN: 0898-929X
Varano E, Vougioukas K, Ma P, et al., 2022, Speech-driven facial animations improve speech-in-noise comprehension of humans, Frontiers in Neuroscience, ISSN: 1662-453X
Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker’s face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person’s face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield a yet higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.
Varano E, Vougioukas K, Ma P, et al., 2022, Speech-Driven Facial Animations Improve Speech-in-Noise Comprehension of Humans, Frontiers in Neuroscience, Vol: 15
Kulkarni A, Kegler M, Reichenbach T, 2021, Effect of visual input on syllable parsing in a computational model of a neural microcircuit for speech processing, Journal of Neural Engineering, Vol: 5, Pages: 1-14, ISSN: 1741-2552
Seeing a person talking can help to understand them, in particular in a noisy environment. However, how the brain integrates the visual information with the auditory signal to enhance speech comprehension remains poorly understood. Here we address this question in a computational model of a cortical microcircuit for speech processing. The model consists of an excitatory and an inhibitory neural population that together create oscillations in the theta frequency range. When stimulated with speech, the theta rhythm becomes entrained to the onsets of syllables, such that the onsets can be inferred from the network activity. We investigate how well the obtained syllable parsing performs when different types of visual stimuli are added. In particular, we consider currents related to the rate of syllables as well as currents related to the mouth-opening area of the talking faces. We find that currents that target the excitatory neuronal population can influence speech comprehension, either boosting or impeding it, depending on the temporal delay and on whether the currents are excitatory or inhibitory. In contrast, currents that act on the inhibitory neurons do not impact speech comprehension significantly. Our results suggest neural mechanisms for the integration of visual information with the acoustic information in speech and make experimentally-testable predictions.
Keshavarzi M, Reichenbach T, Moore BCJ, 2021, Transient Noise Reduction Using a Deep Recurrent Neural Network: Effects on Subjective Speech Intelligibility and Listening Comfort, Trends in Hearing, Vol: 25, ISSN: 2331-2165
Keshavarzi M, Varano E, Reichenbach J, 2021, Cortical tracking of a background speaker modulates the comprehension of a foreground speech signal, The Journal of Neuroscience, Vol: 41, Pages: 5093-5101, ISSN: 0270-6474
Understanding speech in background noise is a difficult task. The tracking of speech rhythms such as the rate of syllables and words by cortical activity has emerged as a key neural mechanism for speech-in-noise comprehension. In particular, recent investigations have used transcranial alternating current stimulation (tACS) with the envelope of a speech signal to influence the cortical speech tracking, demonstrating that this type of stimulation modulates comprehension and therefore evidencing a functional role of the cortical tracking in speech processing. Cortical activity has been found to track the rhythms of a background speaker as well, but the functional significance of this neural response remains unclear. Here we employ a speech-comprehension task with a target speaker in the presence of a distractor voice to show that tACS with the speech envelope of the target voice as well as tACS with the envelope of the distractor speaker both modulate the comprehension of the target speech. Because the envelope of the distractor speech does not carry information about the target speech stream, the modulation of speech comprehension through tACS with this envelope evidences that the cortical tracking of the background speaker affects the comprehension of the foreground speech signal. The phase dependency of the resulting modulation of speech comprehension is, however, opposite to that obtained from tACS with the envelope of the target speech signal. This suggests that the cortical tracking of the ignored speech stream and that of the attended speech stream may compete for neural resources.
Saiz Alia M, Miller P, Reichenbach J, 2021, Otoacoustic emissions evoked by the time-varying harmonic structure of speech, eNeuro, Vol: 8, Pages: 1-12, ISSN: 2373-2822
The human auditory system is exceptional at comprehending an individual speaker even in complex acoustic environments. Because the inner ear, or cochlea, possesses an active mechanism that can be controlled by subsequent neural processing centers through descending nerve fibers, it may already contribute to speech processing. The cochlear activity can be assessed by recording otoacoustic emissions (OAEs), but employing these emissions to assess speech processing in the cochlea is obstructed by the complexity of natural speech. Here, we develop a novel methodology to measure OAEs that are related to the time-varying harmonic structure of speech [speech-distortion-product OAEs (DPOAEs)]. We then employ the method to investigate the effect of selective attention on the speech-DPOAEs. We provide tentative evidence that the speech-DPOAEs are larger when the corresponding speech signal is attended than when it is ignored. Our development of speech-DPOAEs opens up a path to further investigations of the contribution of the cochlea to the processing of complex real-world signals.
Etard O, Messaoud RB, Gaugain G, et al., 2021, The neural response to the temporal fine structure of continuous musical pieces is not affected by selective attention
Speech and music are spectro-temporally complex acoustic signals that are highly relevant for humans. Both contain a temporal fine structure that is encoded in the neural responses of subcortical and cortical processing centres. The subcortical response to the temporal fine structure of speech has recently been shown to be modulated by selective attention to one of two competing voices. Music similarly often consists of several simultaneous melodic lines, and a listener can selectively attend to a particular one at a time. However, the neural mechanisms that enable such selective attention remain largely enigmatic, not least since most investigations to date have focussed on short and simplified musical stimuli. Here we study the neural encoding of classical musical pieces in human volunteers, using scalp electroencephalography (EEG) recordings. We presented volunteers with continuous musical pieces composed of one or two instruments. In the latter case, the participants were asked to selectively attend to one of the two competing instruments and to perform a vibrato identification task. We used linear encoding and decoding models to relate the recorded EEG activity to the stimulus waveform. We show that we can measure neural responses to the temporal fine structure of melodic lines played by one single instrument, at the population level as well as for most individual subjects. The neural response peaks at a latency of 7.6 ms and is not measurable past 15 ms. When analysing the neural responses elicited by competing instruments, we find no evidence of attentional modulation. Our results show that, much like speech, the temporal fine structure of music is tracked by neural activity. In contrast to speech, however, this response appears unaffected by selective attention in the context of our experiment.
Sumner L, Mestel A, Reichenbach J, 2021, Steady streaming as a method for drug delivery to the inner ear, Scientific Reports, Vol: 11, Pages: 1-12, ISSN: 2045-2322
The inner ear, or cochlea, is a fluid-filled organ housing the mechanosensitive hair cells. Sound stimulation is relayed to the hair cells through waves that propagate on the elastic basilar membrane. Sensorineural hearing loss occurs from damage to the hair cells and cannot currently be cured. Although drugs have been proposed to prevent damage or restore functionality to hair cells, a difficulty with such treatments is ensuring adequate drug delivery to the cells. Because the cochlea is encased in the temporal bone, it can only be accessed from its basal end. However, the hair cells that are responsible for detecting speech-frequency sounds reside at the opposite, apical end. In this paper we show that steady streaming can be used to transport drugs along the cochlea. Steady streaming is a nonlinear process that accompanies many fluctuating fluid motions, including the sound-evoked waves in the inner ear. We combine an analytical approximation for the waves in the cochlea with computational fluid dynamic simulations to demonstrate that the combined steady streaming effects of several different frequencies can transport drugs from the base of the cochlea further towards the apex. Our results therefore show that multi-frequency sound stimulation can serve as a non-invasive method to transport drugs efficiently along the cochlea.
Kegler M, Reichenbach J, 2021, Modelling the effects of transcranial alternating current stimulation on the neural encoding of speech in noise, NeuroImage, Vol: 224, ISSN: 1053-8119
Transcranial alternating current stimulation (tACS) can non-invasively modulate neuronal activity in the cerebral cortex, in particular at the frequency of the applied stimulation. Such modulation can matter for speech processing, since the latter involves the tracking of slow amplitude fluctuations in speech by cortical activity. tACS with a current signal that follows the envelope of a speech stimulus has indeed been found to influence the cortical tracking and to modulate the comprehension of the speech in background noise. However, how exactly tACS influences the speech-related cortical activity, and how it causes the observed effects on speech comprehension, remains poorly understood. A computational model for cortical speech processing in a biophysically plausible spiking neural network has recently been proposed. Here we extended the model to investigate the effects of different types of stimulation waveforms, similar to those previously applied in experimental studies, on the processing of speech in noise. We assessed in particular how well speech could be decoded from the neural network activity when paired with the exogenous stimulation. We found that, in the absence of current stimulation, the speech-in-noise decoding accuracy was comparable to the comprehension of speech in background noise of human listeners. We further found that current stimulation could alter the speech decoding accuracy by a few percent, comparable to the effects of tACS on speech-in-noise comprehension. Our simulations further allowed us to identify the parameters for the stimulation waveforms that yielded the largest enhancement of speech-in-noise encoding. Our model thereby provides insight into the potential neural mechanisms by which weak alternating current stimulation may influence speech comprehension and allows screening a large range of stimulation waveforms for their effect on speech processing.
Saiz-Alia M, Reichenbach T, 2020, Computational modeling of the auditory brainstem response to continuous speech, Journal of Neural Engineering, Vol: 17, Pages: 1-12, ISSN: 1741-2552
OBJECTIVE: The auditory brainstem response can be recorded non-invasively from scalp electrodes and serves as an important clinical measure of hearing function. We have recently shown how the brainstem response at the fundamental frequency of continuous, non-repetitive speech can be measured, and have used this measure to demonstrate that the response is modulated by selective attention. However, different parts of the speech signal as well as several parts of the brainstem contribute to this response. Here we employ a computational model of the brainstem to elucidate the influence of these different factors. APPROACH: We developed a computational model of the auditory brainstem by combining a model of the middle and inner ear with a model of globular bushy cells in the cochlear nuclei and with a phenomenological model of the inferior colliculus. We then employed the model to investigate the neural response to continuous speech at different stages in the brainstem, following the methodology developed recently by ourselves for detecting the brainstem response to running speech from scalp recordings. We compared the simulations with recordings from healthy volunteers. MAIN RESULTS: We found that the auditory-nerve fibers, the cochlear nuclei and the inferior colliculus all contributed to the speech-evoked brainstem response, although the dominant contribution came from the inferior colliculus. The delay of the response corresponded to that observed in experiments. We further found that a broad range of harmonics of the fundamental frequency, up to about 8 kHz, contributed to the brainstem response. The response declined with increasing fundamental frequency, although the signal-to-noise ratio was largely unaffected. SIGNIFICANCE: Our results suggest that the scalp-recorded brainstem response at the fundamental frequency of speech originates predominantly in the inferior colliculus. They further show that the response is shaped by a large number of higher harmonics of
Reichenbach J, Keshavarzi M, 2020, Transcranial alternating current stimulation with the theta-band portion of the temporally-aligned speech envelope improves speech-in-noise comprehension, Frontiers in Human Neuroscience, Vol: 14, Pages: 1-8, ISSN: 1662-5161
Transcranial alternating current stimulation with the speech envelope can modulate the comprehension of speech in noise. The modulation stems from the theta- but not the delta-band portion of the speech envelope, and likely reflects the entrainment of neural activity in the theta frequency band, which may aid the parsing of the speech stream. The influence of the current stimulation on speech comprehension can vary with the time delay between the current waveform and the audio signal. While this effect has been investigated for current stimulation based on the entire speech envelope, it has not yet been measured when the current waveform follows the theta-band portion of the speech envelope. Here, we show that transcranial current stimulation with the speech envelope filtered in the theta frequency band improves speech comprehension as compared to a sham stimulus. The improvement occurs when there is no time delay between the current and the speech stimulus, as well as when the temporal delay is comparatively short, 90 ms. In contrast, longer delays, as well as negative delays, do not impact speech-in-noise comprehension. Moreover, we find that the improvement of speech comprehension at no or small delays of the current stimulation is consistent across participants. Our findings suggest that cortical entrainment to speech is most influenced through current stimulation that follows the speech envelope with at most a small delay. They also open a path to enhancing the perception of speech in noise, an issue that is particularly important for people with hearing impairment.
Ota T, Nin F, Choi S, et al., 2020, Characterisation of the static offset in the travelling wave in the cochlear basal turn, Pflügers Archiv European Journal of Physiology, Vol: 472, Pages: 625-635, ISSN: 0031-6768
In mammals, audition is triggered by travelling waves that are evoked by acoustic stimuli in the cochlear partition, a structure containing sensory hair cells and a basilar membrane. When the cochlea is stimulated by a pure tone of low frequency, a static offset occurs in the vibration in the apical turn. In the high-frequency region at the cochlear base, multi-tone stimuli induce a quadratic distortion product in the vibrations that suggests the presence of an offset. However, vibrations below 100 Hz, including a static offset, have not been directly measured there. We therefore constructed an interferometer for detecting motion at low frequencies including 0 Hz. We applied the interferometer to record vibrations from the cochlear base of guinea pigs in response to pure tones. When the animals were exposed to sound at an intensity of 70 dB or higher, we recorded a static offset of the sinusoidally vibrating cochlear partition by more than 1 nm towards the scala vestibuli. The offset’s magnitude grew monotonically as the stimuli intensified. When stimulus frequency was varied, the response peaked around the best frequency, the frequency that maximised the vibration amplitude at threshold sound pressure. These characteristics are consistent with those found in the low-frequency region and are therefore likely common across the cochlea. The offset diminished markedly when the somatic motility of mechanosensitive outer hair cells, the force-generating machinery that amplifies the sinusoidal vibrations, was pharmacologically blocked. Therefore, the partition offset appears to be linked to the electromotile contraction of outer hair cells.
Keshavarzi M, Kegler M, Kadir S, et al., 2020, Transcranial alternating current stimulation in the theta band but not in the delta band modulates the comprehension of naturalistic speech in noise, NeuroImage, Vol: 210, ISSN: 1053-8119
Auditory cortical activity entrains to speech rhythms and has been proposed as a mechanism for online speech processing. In particular, neural activity in the theta frequency band (4–8 Hz) tracks the onset of syllables which may aid the parsing of a speech stream. Similarly, cortical activity in the delta band (1–4 Hz) entrains to the onset of words in natural speech and has been found to encode both syntactic as well as semantic information. Such neural entrainment to speech rhythms is not merely an epiphenomenon of other neural processes, but plays a functional role in speech processing: modulating the neural entrainment through transcranial alternating current stimulation influences the speech-related neural activity and modulates the comprehension of degraded speech. However, the distinct functional contributions of the delta- and of the theta-band entrainment to the modulation of speech comprehension have not yet been investigated. Here we use transcranial alternating current stimulation with waveforms derived from the speech envelope and filtered in the delta and theta frequency bands to alter cortical entrainment in both bands separately. We find that transcranial alternating current stimulation in the theta band but not in the delta band impacts speech comprehension. Moreover, we find that transcranial alternating current stimulation with the theta-band portion of the speech envelope can improve speech-in-noise comprehension beyond sham stimulation. Our results show a distinct contribution of the theta- but not of the delta-band stimulation to the modulation of speech comprehension. In addition, our findings open up a potential avenue of enhancing the comprehension of speech in noise.
Vanheusden F, Kegler M, Ireland K, et al., 2020, Hearing aids do not alter cortical entrainment to speech at audible levels in mild-to-moderately hearing-impaired subjects, Frontiers in Human Neuroscience, Vol: 14, Pages: 1-13, ISSN: 1662-5161
Background: Cortical entrainment to speech correlates with speech intelligibility and attention to a speech stream in noisy environments. However, there is a lack of data on whether cortical entrainment can help in evaluating hearing aid fittings for subjects with mild to moderate hearing loss. One particular problem that may arise is that hearing aids may alter the speech stimulus during (pre-)processing steps, which might alter cortical entrainment to the speech. Here, the effect of hearing aid processing on cortical entrainment to running speech in hearing impaired subjects was investigated. Methodology: Seventeen native English-speaking subjects with mild-to-moderate hearing loss participated in the study. Hearing function and hearing aid fitting were evaluated using standard clinical procedures. Participants then listened to a 25-min audiobook under aided and unaided conditions at 70 dBA sound pressure level (SPL) in quiet conditions. EEG data were collected using a 32-channel system. Cortical entrainment to speech was evaluated using decoders reconstructing the speech envelope from the EEG data. Null decoders, obtained from EEG and the time-reversed speech envelope, were used to assess the chance level reconstructions. Entrainment in the delta- (1–4 Hz) and theta- (4–8 Hz) band, as well as wideband (1–20 Hz) EEG data was investigated. Results: Significant cortical responses could be detected for all but one subject in all three frequency bands under both aided and unaided conditions. However, no significant differences could be found between the two conditions in the number of responses detected, nor in the strength of cortical entrainment. The results show that the relatively small change in speech input provided by the hearing aid was not sufficient to elicit a detectable change in cortical entrainment. Conclusion: For subjects with mild to moderate hearing loss, cortical entrainment to speech in quiet at an audible level is not affected by he
Reichenbach J, Kadir S, Kaza C, et al., 2020, Modulation of speech-in-noise comprehension through transcranial current stimulation with the phase-shifted speech envelope, IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol: 28, Pages: 23-31, ISSN: 1534-4320
Neural activity tracks the envelope of a speech signal at latencies from 50 ms to 300 ms. Modulating this neural tracking through transcranial alternating current stimulation influences speech comprehension. Two important variables that can affect this modulation are the latency and the phase of the stimulation with respect to the sound. While previous studies have found an influence of both variables on speech comprehension, the interaction between both has not yet been measured. We presented 17 subjects with speech in noise coupled with simultaneous transcranial alternating current stimulation. The currents were based on the envelope of the target speech but shifted by different phases, as well as by two temporal delays of 100 ms and 250 ms. We also employed various control stimulations, and assessed the signal-to-noise ratio at which the subject understood half of the speech. We found that, at both latencies, speech comprehension is modulated by the phase of the current stimulation. However, the form of the modulation differed between the two latencies. Phase and latency of neurostimulation have accordingly distinct influences on speech comprehension. The different effects at the latencies of 100 ms and 250 ms hint at distinct neural processes for speech processing.
Weissbart H, Reichenbach J, Kandylaki K, 2020, Cortical tracking of surprisal during continuous speech comprehension, Journal of Cognitive Neuroscience, Vol: 32, Pages: 155-166, ISSN: 0898-929X
Speech comprehension requires rapid online processing of a continuous acoustic signal to extract structure and meaning. Previous studies on sentence comprehension have found neural correlates of the predictability of a word given its context, as well as of the precision of such a prediction. However, they have focussed on single sentences and on particular words in those sentences. Moreover, they compared neural responses to words with low and high predictability, as well as with low and high precision. However, in speech comprehension a listener hears many successive words whose predictability and precision vary over a large range. Here we show that cortical activity in different frequency bands tracks word surprisal in continuous natural speech, and that this tracking is modulated by precision. We obtain these results through quantifying surprisal and precision from naturalistic speech using a deep neural network, and through relating these speech features to electroencephalographic (EEG) responses of human volunteers acquired during auditory story comprehension. We find significant cortical tracking of surprisal at low frequencies including the delta band as well as in the higher-frequency beta and gamma bands, and observe that the tracking is modulated by the precision. Our results pave the way to further investigate the neurobiology of natural speech comprehension.
Etard O, Kegler M, Braiman C, et al., 2019, Decoding of selective attention to continuous speech from the human auditory brainstem response, NeuroImage, Vol: 200, Pages: 1-11, ISSN: 1053-8119
Humans are highly skilled at analysing complex acoustic scenes. The segregation of different acoustic streams and the formation of corresponding neural representations is mostly attributed to the auditory cortex. Decoding of selective attention from neuroimaging has therefore focussed on cortical responses to sound. However, the auditory brainstem response to speech is modulated by selective attention as well, as recently shown through measuring the brainstem's response to running speech. Although the response of the auditory brainstem has a smaller magnitude than that of the auditory cortex, it occurs at much higher frequencies and therefore has a higher information rate. Here we develop statistical models for extracting the brainstem response from multi-channel scalp recordings and for analysing the attentional modulation according to the focus of attention. We demonstrate that the attentional modulation of the brainstem response to speech can be employed to decode the attentional focus of a listener from short measurements of 10 s or less in duration. The decoding remains accurate when obtained from three EEG channels only. We further show how out-of-the-box decoding that employs subject-independent models, as well as decoding that is independent of the specific attended speaker, is capable of achieving similar accuracy. These results open up new avenues for investigating the neural mechanisms for selective attention in the brainstem and for developing efficient auditory brain-computer interfaces.
Saiz Alia M, Forte A, Reichenbach J, 2019, Individual differences in the attentional modulation of the human auditory brainstem response to speech inform on speech-in-noise deficits, Scientific Reports, Vol: 9, ISSN: 2045-2322
People with normal hearing thresholds can nonetheless have difficulty with understanding speech in noisy backgrounds. The origins of such supra-threshold hearing deficits remain largely unclear. Previously we showed that the auditory brainstem response to running speech is modulated by selective attention, evidencing a subcortical mechanism that contributes to speech-in-noise comprehension. We observed, however, significant variation in the magnitude of the brainstem’s attentional modulation between the different volunteers. Here we show that this variability relates to the ability of the subjects to understand speech in background noise. In particular, we assessed 43 young human volunteers with normal hearing thresholds for their speech-in-noise comprehension. We also recorded their auditory brainstem responses to running speech when selectively attending to one of two competing voices. To control for potential peripheral hearing deficits, and in particular for cochlear synaptopathy, we further assessed noise exposure, the temporal sensitivity threshold, the middle-ear muscle reflex, and the auditory-brainstem response to clicks in various levels of background noise. These tests did not show evidence for cochlear synaptopathy amongst the volunteers. Furthermore, we found that only the attentional modulation of the brainstem response to speech was significantly related to speech-in-noise comprehension. Our results therefore evidence an impact of top-down modulation of brainstem activity on the variability in speech-in-noise comprehension amongst the subjects.
Etard O, Reichenbach J, 2019, Neural speech tracking in the theta and in the delta frequency band differentially encode clarity and comprehension of speech in noise, Journal of Neuroscience, Vol: 39, Pages: 5750-5759, ISSN: 0270-6474
Humans excel at understanding speech even in adverse conditions such as background noise. Speech processing may be aided by cortical activity in the delta and theta frequency bands that has been found to track the speech envelope. However, the rhythm of non-speech sounds is tracked by cortical activity as well. It therefore remains unclear which aspects of neural speech tracking represent the processing of acoustic features, related to the clarity of speech, and which aspects reflect higher-level linguistic processing related to speech comprehension. Here we disambiguate the roles of cortical tracking for speech clarity and comprehension through recording EEG responses to native and foreign language in different levels of background noise, for which clarity and comprehension vary independently. We then use both a decoding and an encoding approach to relate clarity and comprehension to the neural responses. We find that cortical tracking in the theta frequency band is mainly correlated to clarity, while the delta band contributes most to speech comprehension. Moreover, we uncover an early neural component in the delta band that informs on comprehension and that may reflect a predictive mechanism for language processing. Our results disentangle the functional contributions of cortical speech tracking in the delta and theta bands to speech processing. They also show that both speech clarity and comprehension can be accurately decoded from relatively short segments of EEG recordings, which may have applications in future mind-controlled auditory prosthesis.
BinKhamis G, Forte AE, Reichenbach J, et al., 2019, Speech auditory brainstem responses in adult hearing aid users: Effects of aiding and background noise, and prediction of behavioral measures, Trends in Hearing, Vol: 23, Pages: 1-20, ISSN: 2331-2165
Evaluation of patients who are unable to provide behavioral responses on standard clinical measures is challenging due to the lack of standard objective (non-behavioral) clinical audiological measures that assess the outcome of an intervention (e.g. hearing aids). Brainstem responses to short consonant-vowel stimuli (speech-ABRs) have been proposed as a measure of subcortical encoding of speech, speech detection, and speech-in-noise performance in individuals with normal hearing. Here, we investigated the potential of speech-ABRs as an objective clinical outcome measure of speech detection, speech-in-noise detection and recognition, and self-reported speech understanding in adults with sensorineural hearing loss. We compared aided and unaided speech-ABRs, and speech-ABRs in quiet and in noise. Additionally, we evaluated whether speech-ABR F0 encoding (obtained from the complex cross-correlation with the 40 ms [da] fundamental waveform) predicted aided behavioral speech recognition in noise and/or aided self-reported speech understanding. Results showed: (i) aided speech-ABRs had earlier peak latencies, larger peak amplitudes, and larger F0 encoding amplitudes compared to unaided speech-ABRs; (ii) the addition of background noise resulted in later F0 encoding latencies, but did not have an effect on peak latencies and amplitudes, or on F0 encoding amplitudes; and (iii) speech-ABRs were not a significant predictor of any of the behavioral or self-report measures. These results show that speech-ABR F0 encoding is not a good predictor of speech-in-noise recognition or self-reported speech understanding with hearing aids. However, our results suggest that speech-ABRs may have potential for clinical application as an objective measure of speech detection with hearing aids.
Braiman C, Fridman A, Conte MM, et al., 2018, Cortical response to the natural speech envelope correlates with neuroimaging evidence of cognition in severe brain injury, Current Biology, Vol: 28, Pages: 3833-3839.E3, ISSN: 0960-9822
Recent studies identify severely brain-injured patients with limited or no behavioral responses who successfully perform functional magnetic resonance imaging (fMRI) or electroencephalogram (EEG) mental imagery tasks [1, 2, 3, 4, 5]. Such tasks are cognitively demanding; accordingly, recent studies support that fMRI command following in brain-injured patients associates with preserved cerebral metabolism and preserved sleep-wake EEG [5, 6]. We investigated the use of an EEG response that tracks the natural speech envelope (NSE) of spoken language [7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22] in healthy controls and brain-injured patients (vegetative state to emergence from minimally conscious state). As audition is typically preserved after brain injury, auditory paradigms may be preferred in searching for covert cognitive function [23, 24, 25]. NSE measures are obtained by cross-correlating EEG with the NSE. We compared NSE latencies and amplitudes with and without consideration of fMRI assessments. NSE latencies showed significant and progressive delay across diagnostic categories. Patients who could carry out fMRI-based mental imagery tasks showed no statistically significant difference in NSE latencies relative to healthy controls; this subgroup included patients without behavioral command following. The NSE may stratify patients with severe brain injuries and identify those patients demonstrating “cognitive motor dissociation” (CMD) who show only covert evidence of command following utilizing neuroimaging or electrophysiological methods that demand high levels of cognitive function. Thus, the NSE is a passive measure that may provide a useful screening tool to improve detection of covert cognition with fMRI or other methods and improve stratification of patients with disorders of consciousness in research studies.
Forte AE, Etard OE, Reichenbach JDT, 2018, Selective Auditory Attention At The Brainstem Level, ARO 2018
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.