Imperial College London

Professor Bjoern Schuller

Faculty of Engineering, Department of Computing

Professor of Artificial Intelligence
 
 
 

Contact

 

+44 (0)20 7594 8357
bjoern.schuller
Website

 
 

Location

 

574 Huxley Building, South Kensington Campus



 

Publications


1004 results found

Schuller BW, Batliner A, Amiriparian S, Bergler C, Gerczuk M, Holz N, Larrouy-Maestri P, Bayerl SP, Riedhammer K, Mallol-Ragolta A, Pateraki M, Coppock H, Kiskin I, Sinka M, Roberts S et al., 2022, The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes, Publisher: arXiv

The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the usual ComParE and BoAW features, the auDeep toolkit, and deep feature extraction from pre-trained CNNs using the DeepSpectrum toolkit; in addition, we add end-to-end sequential modelling, and a log-mel-128-BNN.

Working paper

Mira R, Vougioukas K, Ma P, Petridis S, Schuller BW, Pantic M et al., 2022, End-to-End Video-to-Speech Synthesis Using Generative Adversarial Networks, IEEE TRANSACTIONS ON CYBERNETICS, ISSN: 2168-2267

Journal article

Latif S, Rana R, Khalifa S, Jurdak R, Epps J, Schuller BW et al., 2022, Multi-Task Semi-Supervised Adversarial Autoencoding for Speech Emotion Recognition, IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, Vol: 13, Pages: 992-1004, ISSN: 1949-3045

Journal article

Zhang Y, Weninger F, Schuller B, Picard RW et al., 2022, Holistic Affect Recognition Using PaNDA: Paralinguistic Non-Metric Dimensional Analysis, IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, Vol: 13, Pages: 769-780, ISSN: 1949-3045

Journal article

Stappen L, Baird A, Lienhart M, Baetz A, Schuller B et al., 2022, An Estimation of Online Video User Engagement From Features of Time- and Value-Continuous, Dimensional Emotions, FRONTIERS IN COMPUTER SCIENCE, Vol: 4

Journal article

Ren Z, Chang Y, Bartl-Pokorny KD, Pokorny FB, Schuller BW et al., 2022, The Acoustic Dissection of Cough: Diving into Machine Listening-based COVID-19 Analysis and Detection

Abstract
Purpose: The coronavirus disease 2019 (COVID-19) has caused a crisis worldwide. Considerable effort has gone into preventing and controlling COVID-19's transmission, from early screening to vaccination and treatment. With the recent emergence of many automatic disease recognition applications based on machine listening techniques, it would be fast and cheap to detect COVID-19 from recordings of cough, a key symptom of COVID-19. To date, knowledge of the acoustic characteristics of COVID-19 cough sounds is limited, but would be essential for structuring effective and robust machine learning models. The present study aims to explore acoustic features for distinguishing COVID-19 positive individuals from COVID-19 negative ones based on their cough sounds.
Methods: Following the theory of computational paralinguistics, we analyse the acoustic correlates of COVID-19 cough sounds based on the ComParE feature set, i.e., a standardised set of 6,373 acoustic higher-level features. Furthermore, we train automatic COVID-19 detection models with machine learning methods and explore the latent features by evaluating the contribution of all features to the COVID-19 status predictions.
Results: The experimental results demonstrate that a set of acoustic parameters of cough sounds, e.g., statistical functionals of the root mean square energy and Mel-frequency cepstral coefficients, are relevant for the differentiation between COVID-19 positive and COVID-19 negative cough samples. Our automatic COVID-19 detection model performs significantly above chance level, i.e., at an unweighted average recall (UAR) of 0.632, on a data set consisting of 1,411 cough samples (COVID-19 positiv

Journal article

Bartl-Pokorny KD, Pokorny FB, Garrido D, Schuller BW, Zhang D, Marschik PB et al., 2022, Vocalisation Repertoire at the End of the First Year of Life: An Exploratory Comparison of Rett Syndrome and Typical Development, JOURNAL OF DEVELOPMENTAL AND PHYSICAL DISABILITIES, ISSN: 1056-263X

Journal article

Lefter I, Baird A, Stappen L, Schuller BW et al., 2022, A Cross-Corpus Speech-Based Analysis of Escalating Negative Interactions, FRONTIERS IN COMPUTER SCIENCE, Vol: 4

Journal article

Milling M, Bartl-Pokorny KD, Schuller BW, 2022, Investigating Automatic Speech Emotion Recognition for Children with Autism Spectrum Disorder in interactive intervention sessions with the social robot Kaspar

ABSTRACT
In this contribution, we present the analyses of vocalisation data recorded in the first observation round of the European Commission's Erasmus Plus project "EMBOA, Affective loop in Socially Assistive Robotics as an intervention tool for children with autism". In total, the project partners recorded data in 112 robot-supported intervention sessions for children with autism spectrum disorder. Audio data were recorded using the internal and lapel microphone of the H4n Pro Recorder. To analyse the data, we first utilise a child voice activity detection (VAD) system in order to extract child vocalisations from the raw audio data. For each child, session, and microphone, we provide the total time child vocalisations were detected. Next, we compare the results of two different implementations for valence- and arousal-based speech emotion recognition, thereby processing (1) the child vocalisations detected by the VAD and (2) the total recorded audio material. We provide average valence and arousal values for each session and condition. Finally, we discuss challenges and limitations of child voice detection and audio-based emotion recognition in robot-supported intervention settings.

Journal article

Milling M, Baird A, Bartl-Pokorny KD, Liu S, Alcorn AM, Shen J, Tavassoli T, Ainger E, Pellicano E, Pantic M, Cummins N, Schuller BW et al., 2022, Evaluating the Impact of Voice Activity Detection on Speech Emotion Recognition for Autistic Children, FRONTIERS IN COMPUTER SCIENCE, Vol: 4

Journal article

Schuller BW, Eldar Y, Pantic M, Narayanan S, Virtanen T, Tao J et al., 2022, Editorial: Intelligent Signal Analysis for Contagious Virus Diseases, IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, Vol: 16, Pages: 159-163, ISSN: 1932-4553

Journal article

Wen S, Huang T, Schuller BW, Taher Azar A et al., 2022, Guest Editorial Introduction to the Special Section on Efficient Network Design for Convergence of Deep Learning and Edge Computing, IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, Vol: 9, Pages: 109-110, ISSN: 2327-4697

Journal article

Xu X, Deng J, Cummins N, Zhang Z, Zhao L, Schuller BW et al., 2022, Exploring Zero-Shot Emotion Recognition in Speech Using Semantic-Embedding Prototypes, IEEE TRANSACTIONS ON MULTIMEDIA, Vol: 24, Pages: 2752-2765, ISSN: 1520-9210

Journal article

Lu C, Zong Y, Zheng W, Li Y, Tang C, Schuller BW et al., 2022, Domain Invariant Feature Learning for Speaker-Independent Speech Emotion Recognition, IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, Vol: 30, Pages: 2217-2230, ISSN: 2329-9290

Journal article

Niu M, Zhao Z, Tao J, Li Y, Schuller BW et al., 2022, Selective Element and Two Orders Vectorization Networks for Automatic Depression Severity Diagnosis via Facial Changes, IEEE Transactions on Circuits and Systems for Video Technology, Pages: 1-1, ISSN: 1051-8215

Journal article

Milling M, Pokorny FB, Bartl-Pokorny KD, Schuller BW et al., 2022, Is Speech the New Blood? Recent Progress in AI-Based Disease Detection From Audio in a Nutshell., Front Digit Health, Vol: 4

In recent years, advancements in the field of artificial intelligence (AI) have impacted several areas of research and application. Besides more prominent examples like self-driving cars or media consumption algorithms, AI-based systems have started to gain popularity in the health care sector, albeit restrained by high requirements for accuracy, robustness, and explainability. Health-oriented AI research, as a sub-field of digital health, investigates a plethora of human-centred modalities. In this article, we address recent advances in the so far understudied but highly promising audio domain, with a particular focus on speech data, and present corresponding state-of-the-art technologies. Moreover, we give an excerpt of recent studies on the automatic audio-based detection of diseases, ranging from acute and chronic respiratory diseases via psychiatric disorders to developmental and neurodegenerative disorders. Our selection of presented literature shows that the recent success of deep learning methods in other fields of AI increasingly translates to the field of digital health, although expert-designed feature extractors and classical ML methodologies are still prominently used. Limiting factors, especially for speech-based disease detection systems, relate to the amount and diversity of available data, e.g., the number of patients and healthy controls as well as the underlying distribution of age, languages, and cultures. Finally, we contextualise and outline application scenarios of speech-based disease detection systems as supportive tools for health-care professionals, under ethical consideration of privacy protection and faulty prediction.

Journal article

Amiriparian S, Hübner T, Karas V, Gerczuk M, Ottl S, Schuller BW et al., 2022, DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing From Decentralized Data., Front Artif Intell, Vol: 5

Deep neural speech and audio processing systems have a large number of trainable parameters, a relatively complex architecture, and require a vast amount of training data and computational power. These constraints make it more challenging to integrate such systems into embedded devices and utilise them for real-time, real-world applications. We tackle these limitations by introducing DeepSpectrumLite, an open-source, lightweight transfer learning framework for on-device speech and audio recognition using pre-trained image Convolutional Neural Networks (CNNs). The framework creates and augments Mel spectrogram plots on the fly from raw audio signals, which are then used to fine-tune specific pre-trained CNNs for the target classification task. Subsequently, the whole pipeline can be run in real-time with a mean inference lag of 242.0 ms when a DenseNet121 model is used on a consumer-grade Motorola moto e7 plus smartphone. DeepSpectrumLite operates decentralised, eliminating the need for data upload for further processing. We demonstrate the suitability of the proposed transfer learning approach for embedded audio signal processing by obtaining state-of-the-art results on a set of paralinguistic and general audio tasks, including speech and music emotion recognition, social signal processing, COVID-19 cough and COVID-19 speech analysis, and snore sound classification. We provide an extensive command-line interface for users and developers which is comprehensively documented and publicly available at https://github.com/DeepSpectrum/DeepSpectrumLite.

Journal article

Xu X, Deng J, Zhang Z, Fan X, Zhao L, Devillers L, Schuller BW et al., 2021, Rethinking Auditory Affective Descriptors Through Zero-Shot Emotion Recognition in Speech, IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, ISSN: 2329-924X

Journal article

Baird A, Triantafyllopoulos A, Zaenkert S, Ottl S, Christ L, Stappen L, Konzok J, Sturmbauer S, Messner E-M, Kudielka BM, Rohleder N, Baumeister H, Schuller BW et al., 2021, An Evaluation of Speech-Based Recognition of Emotional and Physiological Markers of Stress, FRONTIERS IN COMPUTER SCIENCE, Vol: 3

Journal article

Coppock H, Jones L, Kiskin I, Schuller B et al., 2021, Bias and privacy in AI's cough-based COVID-19 recognition, LANCET DIGITAL HEALTH, Vol: 3, Pages: E761-E761

Journal article

Qian K, Schmitt M, Zheng H, Koike T, Han J, Liu J, Ji W, Duan J, Song M, Yang Z, Ren Z, Liu S, Zhang Z, Yamamoto Y, Schuller BW et al., 2021, Computer Audition for Fighting the SARS-CoV-2 Corona Crisis-Introducing the Multitask Speech Corpus for COVID-19, IEEE INTERNET OF THINGS JOURNAL, Vol: 8, Pages: 16035-16046, ISSN: 2327-4662

Journal article

Han J, Zhang Z, Mascolo C, Andre E, Tao J, Zhao Z, Schuller BW et al., 2021, Deep Learning for Mobile Mental Health: Challenges and recent advances, IEEE SIGNAL PROCESSING MAGAZINE, Vol: 38, Pages: 96-105, ISSN: 1053-5888

Journal article

Schuller BW, Picard R, Andre E, Gratch J, Tao J et al., 2021, Intelligent Signal Processing for Affective Computing [From the Guest Editors], IEEE SIGNAL PROCESSING MAGAZINE, Vol: 38, Pages: 9-11, ISSN: 1053-5888

Journal article

Schuller B, Baird A, Gebhard A, Amiriparian S, Keren G, Schmitt M, Cummins N et al., 2021, New Avenues in Audio Intelligence: Towards Holistic Real-life Audio Understanding, TRENDS IN HEARING, Vol: 25, ISSN: 2331-2165

Journal article

Mohamed MM, Nessiem MA, Batliner A, Bergler C, Hantke S, Schmitt M, Baird A, Mallol-Ragolta A, Karas V, Amiriparian S, Schuller BW et al., 2021, Face mask recognition from audio: The MASC database and an overview on the mask challenge, PATTERN RECOGNITION, Vol: 122, ISSN: 0031-3203

Journal article

Schadenberg BR, Reidsma D, Evers V, Davison DP, Li JJ, Heylen DKJ, Neves C, Alvito P, Shen J, Pantic M, Schuller BW, Cummins N, Olaru V, Sminchisescu C, Dimitrijevic SB, Petrovic S, Baranger A, Williams A, Alcorn AM, Pellicano E et al., 2021, Predictable Robots for Autistic Children-Variance in Robot Behaviour, Idiosyncrasies in Autistic Children's Characteristics, and Child-Robot Engagement, ACM TRANSACTIONS ON COMPUTER-HUMAN INTERACTION, Vol: 28, ISSN: 1073-0516

Journal article

Deshpande G, Batliner A, Schuller BW, 2021, AI-Based human audio processing for COVID-19: A comprehensive overview, PATTERN RECOGNITION, Vol: 122, ISSN: 0031-3203

Journal article

Qian K, Koike T, Nakamura T, Schuller B, Yamamoto Y et al., 2021, Learning Multimodal Representations for Drowsiness Detection, IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, ISSN: 1524-9050

Journal article

Coppock H, Jones L, Kiskin I, Schuller B et al., 2021, COVID-19 detection from audio: seven grains of salt, LANCET DIGITAL HEALTH, Vol: 3, Pages: E537-E538

Journal article

Quadrianto N, Schuller BW, Lattimore FR, 2021, Editorial: Ethical Machine Learning and Artificial Intelligence, FRONTIERS IN BIG DATA, Vol: 4

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
