Imperial College London

Mr Mike Brookes

Faculty of Engineering, Department of Electrical and Electronic Engineering

Emeritus Reader
 
 
 

Contact

 

+44 (0)20 7594 6165, mike.brookes

 
 

Assistant

 

Miss Vanessa Rodriguez-Gonzalez, +44 (0)20 7594 6267

 

Location

 

807a, Electrical Engineering, South Kensington Campus



 

Publications

Citation

BibTeX format

@article{Wang:2017:10.1109/TASLP.2017.2786863,
author = {Wang, Y and Brookes, DM},
doi = {10.1109/TASLP.2017.2786863},
journal = {IEEE/ACM Transactions on Audio, Speech and Language Processing},
pages = {580--594},
title = {Model-Based Speech Enhancement in the Modulation Domain},
url = {http://dx.doi.org/10.1109/TASLP.2017.2786863},
volume = {26},
year = {2017}
}

RIS format (EndNote, RefMan)

TY  - JOUR
AB  - This paper presents an algorithm for modulation-domain speech enhancement using a Kalman filter. The proposed estimator jointly models the estimated dynamics of the spectral amplitudes of speech and noise to obtain an MMSE estimation of the speech amplitude spectrum with the assumption that the speech and noise are additive in the complex domain. In order to include the dynamics of noise amplitudes with those of speech amplitudes, we propose a statistical “Gaussring” model that comprises a mixture of Gaussians whose centres lie in a circle on the complex plane. The performance of the proposed algorithm is evaluated using the perceptual evaluation of speech quality (PESQ) measure, segmental SNR (segSNR) measure and short-time objective intelligibility (STOI) measure. For speech quality measures, the proposed algorithm is shown to give a consistent improvement over a wide range of SNRs when compared to competitive algorithms. Speech recognition experiments also show that the Gaussring-model-based algorithm performs well for two types of noise.
AU  - Wang,Y
AU  - Brookes,DM
DO  - 10.1109/TASLP.2017.2786863
EP  - 594
PY  - 2017///
SN  - 2329-9304
SP  - 580
TI  - Model-Based Speech Enhancement in the Modulation Domain
T2  - IEEE/ACM Transactions on Audio, Speech and Language Processing
UR  - http://dx.doi.org/10.1109/TASLP.2017.2786863
UR  - http://hdl.handle.net/10044/1/55599
VL  - 26
ER  -