Imperial College London

Patrick A. Naylor

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor of Speech & Acoustic Signal Processing
 
 
 

Contact

 

+44 (0)20 7594 6235, p.naylor, Website

 
 

Location

 

803, Electrical Engineering, South Kensington Campus


Publications

Citation

BibTeX format

@inproceedings{Sharma:2021:10.23919/eusipco47968.2020.9287785,
author = {Sharma, D and Berger, L and Quillen, C and Naylor, PA},
doi = {10.23919/eusipco47968.2020.9287785},
pages = {446--450},
publisher = {IEEE},
title = {Non-intrusive estimation of speech signal parameters using a frame-based machine learning approach},
url = {http://dx.doi.org/10.23919/eusipco47968.2020.9287785},
year = {2021}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB  - We present a novel, non-intrusive method that jointly estimates acoustic signal properties associated with the perceptual speech quality, level of reverberation and noise in a speech signal. We explore various machine learning frameworks, consisting of popular feature extraction front-ends and two types of regression models and show the trade-off in performance that must be considered with each combination. We show that a short-time framework consisting of an 80-dimension log-Mel filter bank feature front-end employing spectral augmentation, followed by a 3 layer LSTM recurrent neural network model achieves a mean absolute error of 3.3 dB for C50, 2.3 dB for segmental SNR and 0.3 for PESQ estimation on the Libri Augmented (LA) database. The internal VAD for this system achieves an F1 score of 0.93 on this data. The proposed system also achieves a 2.4 dB mean absolute error for C50 estimation on the ACE test set. Furthermore, we show how each type of acoustic parameter correlates with ASR performance in terms of ground truth labels and additionally show that the estimated C50, SNR and PESQ from our proposed method have a high correlation (greater than 0.92) with WER on the LA test set.
AU  - Sharma,D
AU  - Berger,L
AU  - Quillen,C
AU  - Naylor,PA
DO  - 10.23919/eusipco47968.2020.9287785
EP  - 450
PB  - IEEE
PY  - 2021///
SP  - 446
TI  - Non-intrusive estimation of speech signal parameters using a frame-based machine learning approach
UR  - http://dx.doi.org/10.23919/eusipco47968.2020.9287785
UR  - https://ieeexplore.ieee.org/document/9287785
ER  -