Imperial College London

Dr Patrick A. Naylor

Faculty of Engineering, Department of Electrical and Electronic Engineering

Professor of Speech & Acoustic Signal Processing
 
 
 

Contact

 

+44 (0)20 7594 6235
p.naylor

 
 

Location

 

803, Electrical Engineering, South Kensington Campus

Citation

BibTeX format

@inproceedings{Sharma,
author = {Sharma, D and Hogg, A and Wang, Y and Nour-Eldin, A and Naylor, P},
publisher = {IEEE},
title = {Non-Intrusive {POLQA} estimation of speech quality using recurrent neural networks},
url = {http://hdl.handle.net/10044/1/72098},
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB  - Estimating the quality of speech without the use of a clean reference signal is a challenging problem, in part due to the time and expense required to collect sufficient training data for modern machine learning algorithms. We present a novel, non-intrusive estimator that exploits recurrent neural network architectures to predict the intrusive POLQA score of a speech signal in a short time context. The predictor is based on a novel compressed representation of modulation domain features, used in conjunction with static MFCC features. We show that the proposed method can reliably predict POLQA with a 300 ms context, achieving a mean absolute error of 0.21 on unseen data. The proposed method is trained using English speech and is shown to generalize well across unseen languages. The neural network also jointly estimates the mean voice activity detection (VAD) with an F1 accuracy score of 0.9, removing the need for an external VAD.
AU  - Sharma,D
AU  - Hogg,A
AU  - Wang,Y
AU  - Nour-Eldin,A
AU  - Naylor,P
PB  - IEEE
TI  - Non-Intrusive POLQA estimation of speech quality using recurrent neural networks
UR  - http://hdl.handle.net/10044/1/72098
ER  -