Search results
-
Journal articleHao C, Rassoul B, Clerckx B, 2017,
Achievable DoF regions of MIMO networks with imperfect CSIT
, IEEE Transactions on Information Theory, Vol: 63, Pages: 6587-6606, ISSN: 1557-9654
We focus on a two-receiver Multiple-Input-Multiple-Output (MIMO) Broadcast Channel (BC) and Interference Channel (IC) with an arbitrary number of antennas at each node. We assume imperfect knowledge of local Channel State Information at the Transmitters (CSIT), whose error decays with the Signal-to-Noise Ratio. With this configuration, we characterize the achievable Degrees-of-Freedom (DoF) regions in both the BC and the IC by proposing a Rate-Splitting (RS) approach, which divides each receiver’s message into a common part and a private part. Compared to the RS scheme designed for the symmetric MIMO case, the novelties of the proposed block lie in 1) delivering additional non-ZF-precoded private symbols to the receiver with the greater number of antennas, and 2) a Space-Time implementation. These features provide more flexibility in balancing the common-message decodability at the two receivers, and fully exploit asymmetric antenna arrays. Besides, in the IC, we modify the power allocation designed for the asymmetric BC based on the signal space where the two transmitted signals interfere with each other. We also derive an outer bound for the DoF regions and show that the proposed achievable DoF regions are optimal under some antenna configurations and CSIT qualities.
-
Journal articleMachen A, Wang S, Leung KK, et al., 2017,
Live Service Migration in Mobile Edge Clouds
, IEEE WIRELESS COMMUNICATIONS, Vol: 25, Pages: 140-147, ISSN: 1536-1284
Mobile edge clouds (MECs) bring the benefits of the cloud closer to the user, by installing small cloud infrastructures at the network edge. This enables a new breed of real-time applications, such as instantaneous object recognition and safety assistance in intelligent transportation systems, that require very low latency. One key issue that comes with proximity is how to ensure that users always receive good performance as they move across different locations. Migrating services between MECs is seen as the means to achieve this. This article presents a layered framework for migrating active service applications that are encapsulated either in virtual machines (VMs) or containers. This layering approach allows a substantial reduction in service downtime. The framework is easy to implement using readily available technologies, and one of its key advantages is that it supports containers, which is a promising emerging technology that offers tangible benefits over VMs. The migration performance of various real applications is evaluated by experiments under the presented framework. Insights drawn from the experimentation results are discussed.
-
Conference paperGabillard T, Sridhar V, Manikas A, 2017,
Capacity loss and antenna array geometry
, IEEE International Conference on Communications (ICC), Publisher: IEEE, ISSN: 1938-1883
The impact of antenna array geometry on its ability to mitigate interference, and hence on the channel capacity, is a topic that is seldom studied and is crucial to future systems that will employ large arrays. In this paper, for the worst-case scenario where interferers are located spatially close to the desired users, the “capacity loss” is defined and expressed as a function of array geometry and propagation environment. Based on the analytical results, simulation studies of the capacity loss are presented for different array geometries and various key insights on antenna array design are highlighted.
-
Journal articleGoverdovsky V, von Rosenberg W, Nakamura T, et al., 2017,
Hearables: multimodal physiological in-ear sensing
, Scientific Reports, Vol: 7, ISSN: 2045-2322
Future health systems require the means to assess and track the neural and physiological function of a user over long periods of time, and in the community. Human body responses are manifested through multiple, interacting modalities – the mechanical, electrical and chemical; yet, current physiological monitors (e.g. actigraphy, heart rate) largely lack in cross-modal ability, are inconvenient and/or stigmatizing. We address these challenges through an inconspicuous earpiece, which benefits from the relatively stable position of the ear canal with respect to vital organs. Equipped with miniature multimodal sensors, it robustly measures the brain, cardiac and respiratory functions. Comprehensive experiments validate each modality within the proposed earpiece, while its potential in wearable health monitoring is illustrated through case studies spanning these three functions. We further demonstrate how combining data from multiple sensors within such an integrated wearable device improves both the accuracy of measurements and the ability to deal with artifacts in real-world scenarios.
-
Conference paperElMikaty M, Stathaki P, 2017,
Detection of cars in complex urban areas
, IAPR Conference on Machine Vision Applications, Publisher: IEEE
Detection of cars in airborne images of typical urban areas has various applications in several domains, such as surveillance, military and remote sensing. It is a tremendously challenging problem, mainly because of the significant inter-class similarity among various objects in urban environments. In this paper, a novel framework is introduced that adopts a sliding-window approach and depicts, in a novel way, the local distribution of gradients, colours and texture. A linear support vector machine classifier is used to differentiate between descriptors that belong to cars and descriptors that belong to other objects in a hyperspace of 3838 dimensions. Descriptors are computed over a newly proposed adaptive distribution of cells that enables the use of various rotation-variant image descriptors. The proposed framework has been evaluated on the Vaihingen dataset and results corroborate its superiority as it achieves a higher precision for a given recall than the state of the art.
-
Conference paperNakamura T, Adjei T, Alqurashi Y, et al., 2017,
Complexity science for sleep stage classification from EEG
, IEEE International Joint Conference on Neural Networks (IJCNN) 2017, Publisher: IEEE, Pages: 4387-4394, ISSN: 2161-4407
Automatic sleep stage classification is an important paradigm in computational intelligence and promises considerable advantages to health care. Most current automated methods require multiple electroencephalogram (EEG) channels and typically cannot distinguish the S1 sleep stage from EEG. The aim of this study is to revisit automatic sleep stage classification from EEGs using complexity science methods. The proposed method applies fuzzy entropy and permutation entropy as kernels of multi-scale entropy analysis. To account for sleep transition, the preceding and following 30 seconds of epoch data were used for analysis as well as the current epoch. Combining the entropy and spectral edge frequency features extracted from one EEG channel, a multi-class support vector machine (SVM) was able to classify 93.8% of 5 sleep stages for the SleepEDF database [expanded], with a sensitivity of 49.1% for the S1 stage. Also, the Kappa coefficient was 0.90, which indicates almost perfect agreement.
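As an illustration of one of the entropy kernels named in this abstract, the following is a minimal sketch of plain, single-scale permutation entropy. It is not the authors' implementation: the function name, the `order` and `delay` parameters, and the normalisation to [0, 1] are assumptions made for this example, and the multi-scale coarse-graining step described in the abstract is omitted.

```python
import math
from collections import Counter


def permutation_entropy(signal, order=3, delay=1):
    """Normalised permutation entropy of a 1-D sequence (illustrative sketch).

    `order` is the length of each ordinal pattern and `delay` the spacing
    between samples in a pattern; both are assumptions of this example.
    """
    n_windows = len(signal) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("signal too short for the chosen order and delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1

    # Shannon entropy of the pattern distribution, in nats.
    entropy = -sum(
        (count / n_windows) * math.log(count / n_windows)
        for count in patterns.values()
    )
    # Divide by the maximum possible entropy, log(order!), to land in [0, 1].
    return entropy / math.log(math.factorial(order))
```

A monotonic sequence contains a single ordinal pattern and therefore scores zero, while irregular sequences score closer to one.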
-
Journal articleWu J, Watson R, Bolla R, et al., 2017,
Guest Editorial on Green Communications, Computing, and Systems
, IEEE Systems Journal, Vol: 11, Issue: 2, Pages: 546-550, ISSN: 1932-8184
-
ReportEaton DJ, Gaubitch ND, Moore AH, et al., 2017,
Acoustic Characterization of Environments (ACE) Challenge Results Technical Report
, Publisher: arXiv
This document provides supplementary information, and the results of the tests of acoustic parameter estimation algorithms on the Acoustic Characterization of Environments (ACE) Challenge Evaluation dataset which were subsequently submitted and written up into papers for the Proceedings of the ACE Challenge [2]. This document is supporting material for a forthcoming journal paper on the ACE Challenge which will provide further analysis of the results.
-
Journal articleNakamura T, Goverdovsky V, Morrell M, et al., 2017,
Automatic sleep monitoring using ear-EEG
, IEEE Journal of Translational Engineering in Health and Medicine, Vol: 5, ISSN: 2168-2372
The monitoring of sleep patterns without patient’s inconvenience or involvement of a medical specialist is a clinical question of significant importance. To this end, we propose an automatic sleep stage monitoring system based on an affordable, unobtrusive, discreet, and long-term wearable in-ear sensor for recording the Electroencephalogram (ear-EEG). The selected features for sleep pattern classification from a single ear-EEG channel include the spectral edge frequency (SEF) and multiscale fuzzy entropy (MSFE), a structural complexity feature. In this preliminary study, the manually scored hypnograms from simultaneous scalp-EEG and ear-EEG recordings of four subjects are used as labels for two analysis scenarios: 1) classification of ear-EEG hypnogram labels from ear-EEG recordings and 2) prediction of scalp-EEG hypnogram labels from ear-EEG recordings. We consider both 2-class and 4-class sleep scoring, with the achieved accuracies ranging from 78.5% to 95.2% for ear-EEG labels predicted from ear-EEG, and 76.8% to 91.8% for scalp-EEG labels predicted from ear-EEG. The corresponding Kappa coefficients range from 0.64 to 0.83 for Scenario 1, and indicate Substantial to Almost Perfect Agreement, while for Scenario 2 the range of 0.65 to 0.80 indicates Substantial Agreement, thus further supporting the feasibility of in-ear sensing for sleep monitoring in the community.
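The Kappa coefficients and agreement labels quoted in this abstract can be illustrated with a short sketch of Cohen's kappa for two label sequences. The function names are hypothetical, and the mapping from kappa to verbal bands assumes the widely used Landis and Koch scale, which is consistent with the ranges quoted above but is not stated explicitly in the abstract.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences (sketch)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)

    # Observed agreement: fraction of epochs given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement from the two marginal label distributions.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Verbal band for kappa, assuming the Landis and Koch scale."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa >= 0.0:
        return "slight"
    return "poor"
```

Under this assumed scale, the Scenario 1 range of 0.64 to 0.83 spans the "substantial" and "almost perfect" bands, and the Scenario 2 range of 0.65 to 0.80 stays within "substantial".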
-
Journal articleKanna S, Mandic DP, 2017,
Self-stabilising adaptive three-phase transforms via widely linear modelling
, ELECTRONICS LETTERS, Vol: 53, Pages: 875-876, ISSN: 0013-5194
-
Conference paperLiu T, Stathaki, 2017,
Fast Head-Shoulder Proposal for Scale-aware Pedestrian Detection
, International Conference on Pervasive Technologies Related to Assistive Environments
-
Conference paperYiallourides C, Manning V, Moore AH, et al., 2017,
A dynamic programming approach for automatic stride detection and segmentation in acoustic emission from the knee
, 2017 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), Publisher: Institute of Electrical and Electronics Engineers (IEEE), Pages: 401-405, ISSN: 1520-6149
We study the acquisition and analysis of sounds generated by the knee during walking with particular focus on the effects due to osteoarthritis. Reliable contact instant estimation is essential for stride synchronous analysis. We present a dynamic programming based algorithm for automatic estimation of both the initial contact instants (ICIs) and last contact instants (LCIs) of the foot to the floor. The technique is designed for acoustic signals sensed at the patella of the knee. It uses the phase-slope function to generate a set of candidates and then finds the most likely ones by minimizing a cost function that we define. ICIs are identified with an RMS error of 13.0% for healthy and 14.6% for osteoarthritic knees and LCIs with an RMS error of 16.0% and 17.0% respectively.
-
Conference paperLightburn L, De Sena E, Moore AH, et al., 2017,
Improving the perceptual quality of ideal binary masked speech
, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: Institute of Electrical and Electronics Engineers (IEEE), Pages: 661-665, ISSN: 1520-6149
It is known that applying a time-frequency binary mask to very noisy speech can improve its intelligibility but results in poor perceptual quality. In this paper we propose a new approach to applying a binary mask that combines the intelligibility gains of conventional binary masking with the perceptual quality gains of a classical speech enhancer. The binary mask is not applied directly as a time-frequency gain as in most previous studies. Instead, the mask is used to supply prior information to a classical speech enhancer about the probability of speech presence in different time-frequency regions. Using an oracle ideal binary mask, we show that the proposed method results in a higher predicted quality than other methods of applying a binary mask whilst preserving the improvements in predicted intelligibility.
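For readers unfamiliar with the term, the sketch below shows how an oracle ideal binary mask is conventionally constructed, and how it is applied directly as a gain, which is the baseline this abstract contrasts itself with. It does not implement the paper's proposed method of using the mask as prior information for a speech enhancer. The function names, the nested-list representation of time-frequency power, and the default 0 dB local SNR threshold are assumptions made for this example.

```python
def ideal_binary_mask(speech_power, noise_power, threshold_db=0.0):
    """Oracle binary mask: 1 where the local SNR exceeds the threshold.

    Both inputs are lists of frames, each frame a list of per-bin powers.
    The separate clean speech and noise powers are only available in an
    oracle setting, hence "ideal".
    """
    ratio = 10.0 ** (threshold_db / 10.0)  # threshold as a power ratio
    return [
        [1 if s > n * ratio else 0 for s, n in zip(s_frame, n_frame)]
        for s_frame, n_frame in zip(speech_power, noise_power)
    ]


def apply_mask_as_gain(noisy_power, mask):
    """Conventional use: multiply each time-frequency cell by the mask."""
    return [
        [p * m for p, m in zip(p_frame, m_frame)]
        for p_frame, m_frame in zip(noisy_power, mask)
    ]
```

Comparing power ratios rather than taking logarithms avoids special handling of cells where the speech or noise power is zero.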
-
Conference paperHafezi S, Moore AH, Naylor P, 2017,
Multiple source localization using estimation consistency in the time-frequency domain
, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: Institute of Electrical and Electronics Engineers (IEEE), Pages: 516-520, ISSN: 1520-6149
The extraction of multiple Direction-of-Arrival (DoA) information from estimated spatial spectra can be challenging when such spectra are noisy or the sources are adjacent. Smoothing or clustering techniques are typically used to remove the effect of noise or irregular peaks in the spatial spectra. As we will explain and show in this paper, the smoothing-based techniques require prior knowledge of the minimum angular separation of the sources and the clustering-based techniques fail on noisy spatial spectra. A broad class of localization techniques give direction estimates in each Time-Frequency (TF) bin. Using this information as input, a novel technique for obtaining robust localization of multiple simultaneous sources is proposed using Estimation Consistency (EC) in the TF domain. The method is evaluated in the context of spherical microphone arrays. This technique does not require prior knowledge of the sources and, by removing the noise in the estimated spatial spectrum, makes clustering a reliable and robust technique for multiple DoA extraction from estimated spatial spectra. The results indicate that the proposed technique has the strongest robustness to separation with up to 10° median error for 5° to 180° separation for 2 and 3 sources, compared to the baseline and the state-of-the-art techniques.
-
Conference paperMoore AH, Brookes D, Naylor PA, 2017,
Robust spherical harmonic domain interpolation of spatially sampled array manifolds
, IEEE International Conference on Acoustics Speech and Signal Processing, Publisher: Institute of Electrical and Electronics Engineers (IEEE), Pages: 521-525, ISSN: 1520-6149
Accurate interpolation of the array manifold is an important first step for the acoustic simulation of rapidly moving microphone arrays. Spherical harmonic domain interpolation has been proposed and well studied in the context of head-related transfer functions but has focussed on perceptual, rather than numerical, accuracy. In this paper we analyze the effect of measurement noise on spatial aliasing. Based on this analysis we propose a method for selecting the truncation orders for the forward and reverse spherical Fourier transforms given only the noisy samples in such a way that the interpolation error is minimized. The proposed method achieves up to 1.7 dB improvement over the baseline approach.
-
Conference paperEvers C, Dorfan Y, Gannot S, et al., 2017,
Source tracking using moving microphone arrays for robot audition
, IEEE Intl. Conf. on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE
Intuitive spoken dialogues are a prerequisite for human-robot interaction. In many practical situations, robots must be able to identify and focus on sources of interest in the presence of interfering speakers. Techniques such as spatial filtering and blind source separation are therefore often used, but rely on accurate knowledge of the source location. In practice, sound emitted in enclosed environments is subject to reverberation and noise. Hence, sound source localization must be robust to both diffuse noise due to late reverberation, as well as spurious detections due to early reflections. For improved robustness against reverberation, this paper proposes a novel approach for sound source tracking that constructively exploits the spatial diversity of a microphone array installed in a moving robot. In previous work, we developed speaker localization methods based on expectation-maximization (EM) and on Bayesian approaches. In this paper we propose to combine the EM and Bayesian approaches in one framework for improved robustness against reverberation and noise.
-
Conference paperPapayiannis C, Evers C, Naylor PA, 2017,
Discriminative feature domains for reverberant acoustic environments
, 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, ISSN: 2379-190X
Several speech processing and audio data-mining applications rely on a description of the acoustic environment as a feature vector for classification. The discriminative properties of the feature domain play a crucial role in the effectiveness of these methods. In this work, we consider three environment identification tasks and the task of acoustic model selection for speech recognition. A set of acoustic parameters and Machine Learning algorithms for feature selection are used and an analysis is performed on the resulting feature domains for each task. In our experiments, a classification accuracy of 100% is achieved for the majority of tasks and the Word Error Rate is reduced by 20.73 percentage points for Automatic Speech Recognition when using the resulting domains. Experimental results indicate a significant dissimilarity in the parameter choices for the composition of the domains, which highlights the importance of the feature selection process for individual applications.
-
Conference paperXue W, Brookes M, Naylor PA, 2017,
Frequency-domain under-modelled blind system identification based on cross power spectrum and sparsity regularization
, IEEE International Conference on Acoustics, Speech and Signal Processing, Pages: 591-595, ISSN: 1520-6149
In room acoustics, under-modelled multichannel blind system identification (BSI) aims to estimate the early part of the room impulse responses (RIRs), and it can be widely used in applications such as speaker localization, room geometry identification and beamforming based speech dereverberation. In this paper we extend our recent study on under-modelled BSI from the time domain to the frequency domain, such that the RIRs can be updated frame-wise and the efficiency of Fast Fourier Transform (FFT) is exploited to reduce the computational complexity. Analogous to the cross-correlation based criterion in the time domain, a frequency-domain cross power spectrum based criterion is proposed. As the early RIRs are usually sparse, the RIRs are estimated by jointly maximizing the cross power spectrum based criterion in the frequency domain and minimizing the ℓ1-norm sparsity measure in the time domain. A two-stage LMS updating algorithm is derived to achieve joint optimization of these two targets. The experimental results in different under-modelled scenarios demonstrate the effectiveness of the proposed method.
-
Journal articleClerckx B, Zawawi ZB, Huang K, 2017,
Wirelessly powered backscatter communications: waveform design and SNR-energy tradeoff
, IEEE Communications Letters, Vol: 21, Pages: 2234-2237, ISSN: 1558-2558
This paper shows that wirelessly powered backscatter communications is subject to a fundamental tradeoff between the harvested energy at the tag and the reliability of the backscatter communication, measured in terms of SNR at the reader. Assuming the RF transmit signal is a multisine waveform adaptive to the channel state information, we derive a systematic approach to optimize the transmit waveform weights (amplitudes and phases) in order to enlarge as much as possible the SNR-energy region. Performance evaluations confirm the significant benefits of using multiple frequency components in the adaptive transmit multisine waveform to exploit the nonlinearity of the rectifier and a frequency diversity gain.
-
Journal articlevon Rosenberg WC, Chanwimalueang T, Adjei T, et al., 2017,
Resolving ambiguities in the LF/HF Ratio: LF-HF scatter plots for the categorization of mental and physical stress from HRV
, Frontiers in Physiology, Vol: 8, ISSN: 1664-042X
It is generally accepted that the activities of the autonomic nervous system (ANS), which consists of the sympathetic (SNS) and parasympathetic nervous systems (PNS), are reflected in the low- (LF) and high-frequency (HF) bands in heart rate variability (HRV), while, not without some controversy, the ratio of the powers in those frequency bands, the so-called LF-HF ratio (LF/HF), has been used to quantify the degree of sympathovagal balance. Indeed, recent studies demonstrate that, in general: (i) sympathovagal balance cannot be accurately measured via the ratio of the LF- and HF-power bands; and (ii) the correspondence between the LF/HF ratio and the psychological and physiological state of a person is not unique. Since the standard LF/HF ratio provides only a single degree of freedom for the analysis of this 2D phenomenon, we propose a joint treatment of the LF and HF powers in HRV within a two-dimensional representation framework, thus providing the required degrees of freedom. By virtue of the proposed 2D representation, the restrictive assumption of the linear dependence between the activity of the autonomic nervous system (ANS) and the LF-HF frequency band powers is demonstrated to become unnecessary. The proposed analysis framework also opens up completely new possibilities for a more comprehensive and rigorous examination of HRV in relation to physical and mental states of an individual, and makes possible the categorization of different stress states based on HRV. In addition, based on instantaneous amplitudes of Hilbert-transformed LF- and HF-bands, a novel approach to estimate the markers of stress in HRV is proposed and is shown to improve the robustness to artifacts and irregularities, critical issues in real-world recordings. The proposed approach for resolving the ambiguities in the standard LF/HF-ratio analyses is verified over a number of real-world stress-invoking scenarios.
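The ambiguity described in this abstract can be made concrete with a small sketch that computes LF and HF band powers from a sampled power spectral density and compares the scalar ratio with the two-dimensional (LF, HF) point. The band edges of 0.04 to 0.15 Hz and 0.15 to 0.40 Hz are the conventional HRV bands and are an assumption here, as are the function names and the toy spectra; the Hilbert-transform approach mentioned in the abstract is not covered.

```python
def band_power(freqs, psd, low, high):
    """Trapezoidal integral of the PSD over segments inside [low, high]."""
    total = 0.0
    for i in range(len(freqs) - 1):
        f0, f1 = freqs[i], freqs[i + 1]
        if f0 >= low and f1 <= high:
            total += 0.5 * (psd[i] + psd[i + 1]) * (f1 - f0)
    return total


def lf_hf_point(freqs, psd):
    """Return the (LF power, HF power) pair for one HRV spectrum."""
    lf = band_power(freqs, psd, 0.04, 0.15)  # assumed LF band edges in Hz
    hf = band_power(freqs, psd, 0.15, 0.40)  # assumed HF band edges in Hz
    return lf, hf


# Two toy spectra: the second has twice the power at every frequency.
freqs = [0.04, 0.10, 0.15, 0.25, 0.40]
point_a = lf_hf_point(freqs, [1.0] * 5)
point_b = lf_hf_point(freqs, [2.0] * 5)

ratio_a = point_a[0] / point_a[1]
ratio_b = point_b[0] / point_b[1]
```

Here `ratio_a` and `ratio_b` coincide even though `point_a` and `point_b` differ, which is the kind of non-uniqueness that a scatter plot of LF against HF keeps visible and a single ratio hides.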
-
Journal articleHe T, Gkelias A, Ma L, et al., 2017,
Robust and efficient monitor placement for network tomography in dynamic networks
, IEEE/ACM Transactions on Networking, Vol: 25, Pages: 1732-1745, ISSN: 1063-6692
We consider the problem of placing the minimum number of monitors in a dynamic network to identify additive link metrics from path metrics measured along cycle-free paths between monitors. Our goal is robust monitor placement, i.e., the same set of monitors can maintain network identifiability under topology changes. Our main contribution is a set of monitor placement algorithms with different performance-complexity tradeoffs that can simultaneously identify multiple topologies occurring during the network lifetime. In particular, we show that the optimal monitor placement is the solution to a generalized hitting set problem, for which we provide a polynomial-time algorithm to construct the input and a greedy algorithm to select the monitors with logarithmic approximation. Although the optimal placement is NP-hard in general, we identify non-trivial special cases that can be solved efficiently. Our secondary contribution is a dynamic triconnected decomposition algorithm to compute the input needed by the monitor placement algorithms, which is the first such algorithm that can handle edge deletions. Our evaluations on mobility-induced dynamic topologies verify the efficiency and the robustness of the proposed algorithms.
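The greedy selection with logarithmic approximation mentioned in this abstract follows a well-known pattern, sketched below for the plain hitting set problem. This is a generic textbook heuristic under simplifying assumptions, not the paper's algorithm: the generalized hitting set formulation and the construction of its input from the network topologies are not reproduced, and the function name and example node labels are hypothetical.

```python
def greedy_hitting_set(sets):
    """Pick elements until every input set contains a chosen element.

    Each round selects the element that appears in the most sets that are
    not yet hit. Ties are broken by sorted order so the result is
    deterministic, which requires mutually comparable elements.
    """
    remaining = [set(s) for s in sets]
    if any(not s for s in remaining):
        raise ValueError("an empty set can never be hit")

    chosen = []
    while remaining:
        counts = {}
        for s in remaining:
            for element in s:
                counts[element] = counts.get(element, 0) + 1
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        # Drop every set that the new element hits.
        remaining = [s for s in remaining if best not in s]
    return chosen


# Hypothetical candidate sets of monitor locations, one per constraint.
constraints = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
monitors = greedy_hitting_set(constraints)
```

In this toy instance the heuristic first picks "b", which hits two sets, and then "c" for the remaining one.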
-
Journal articleCichocki A, Anh-Huy P, Zhao Q, et al., 2017,
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations Part 2 Applications and Future Perspectives
, FOUNDATIONS AND TRENDS IN MACHINE LEARNING, Vol: 9, Pages: 431-+, ISSN: 1935-8237
-
Journal articleClerckx B, Bayguzina E, 2017,
A low-complexity adaptive multisine waveform design for wireless power transfer
, IEEE Antennas and Wireless Propagation Letters, Vol: 16, Pages: 2207-2210, ISSN: 1548-5757
Channel-adaptive waveforms for Wireless Power Transfer significantly boost the DC power level at the rectifier output. However, the design of those waveforms is computationally complex and does not lend itself easily to practical implementation. We here propose a low-complexity channel-adaptive waveform design whose performance is very close to that of the optimal design. Performance evaluations confirm the new design’s benefits in various rectifier topologies, with gains in DC output power of 100% over conventional waveforms.
-
Conference paperRossi G, Fan Z, Chin WH, et al., 2017,
Stable clustering for Ad-Hoc vehicle networking
, IEEE Wireless Communications and Networking Conference (WCNC) 2017, ISSN: 1558-2612
Vehicular ad-hoc networks (VANETs) that enable communication among vehicles and between vehicles and unmanned aerial vehicles (UAVs) and cellular base stations have recently attracted significant interest from the research community, due to the wide range of practical applications they can facilitate (e.g. road safety, traffic management, pollution monitoring and rescue missions). Despite this increased research activity, the high vehicle mobility in a VANET raises concerns regarding the robustness and adaptiveness of such networks to support system applications. Instead of allowing direct communications between every vehicle and UAVs or base stations, clustering methods will potentially be efficient to overcome bandwidth, power consumption and other resource issues. Using the clustering technique, neighbouring vehicles are grouped into clusters with a particular vehicle elected as the Cluster Head (CH) in each cluster. Each vehicle communicates with UAVs or base stations through the CH of the associated cluster. Despite the potential advantages, a major challenge for clustering techniques is to maintain cluster stability in light of vehicle mobility and radio fluctuation. In this paper, we propose a Stable Clustering Algorithm for vehicular ad hoc networks (SCalE). Two novel features are incorporated into the algorithm: knowledge of the vehicles’ behaviour for efficient selection of CHs, and the employment of a backup CH to maintain the stability of cluster structures. By simulation methods, these are shown to increase stability and improve performance when compared to existing clustering algorithms.
-
Conference paperHuang, Liu T, Dragotti, et al., 2017,
SRHRF+: Self-Example Enhanced Single Image Super-Resolution Using Hierarchical Random Forests
, Computer Vision and Pattern Recognition Workshops
-
Journal articleChanwimalueang T, Aufegger L, Adjei T, et al., 2017,
Stage call: Cardiovascular reactivity to audition stress in musicians
, PLOS ONE, Vol: 12, ISSN: 1932-6203
Auditioning is at the very center of educational and professional life in music and is associated with significant psychophysical demands. Knowledge of how these demands affect cardiovascular responses to psychosocial pressure is essential for developing strategies to both manage stress and understand optimal performance states. To this end, we recorded the electrocardiograms (ECGs) of 16 musicians (11 violinists and 5 flutists) before and during performances in both low- and high-stress conditions: with no audience and in front of an audition panel, respectively. The analysis consisted of the detection of R-peaks in the ECGs to extract heart rate variability (HRV) from the notoriously noisy real-world ECGs. Our data analysis approach spanned both standard (temporal and spectral) and advanced (structural complexity) techniques. The complexity science approaches, namely multiscale sample entropy and multiscale fuzzy entropy, indicated a statistically significant decrease in structural complexity in HRV from the low- to the high-stress condition and an increase in structural complexity from the pre-performance to performance period, thus confirming the complexity loss theory and a loss in degrees of freedom due to stress. Results from the spectral analyses also suggest that the stress responses in the female participants were more parasympathetically driven than those of the male participants. In conclusion, our findings suggest that interventions to manage stress are best targeted at the sensitive pre-performance period, before an audition begins.
-
Conference paperLöllmann HW, Moore AH, Naylor PA, et al., 2017,
Microphone array signal processing for robot audition
, 2017 Hands-free Speech Communications and Microphone Arrays (HSCMA), Publisher: IEEE, Pages: 51-55
Robot audition for humanoid robots interacting naturally with humans in an unconstrained real-world environment is a hitherto unsolved challenge. The recorded microphone signals are usually distorted by background and interfering noise sources (speakers) as well as room reverberation. In addition, the movements of a robot and its actuators cause ego-noise which degrades the recorded signals significantly. The movement of the robot body and its head also complicates the detection and tracking of the desired, possibly moving, sound sources of interest. This paper presents an overview of the concepts in microphone array processing for robot audition and some recent achievements.
-
Conference paperEaton DJ, Javed HA, Naylor PA, 2017,
Estimation of the perceived level of reverberation using non-intrusive single-channel variance of decay rates
, Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Publisher: IEEE
The increasing processing power of hearing aids and mobile devices has led to the potential for incorporation of dereverberation algorithms to improve speech quality for the listener. Assessing the effectiveness of dereverberation algorithms using subjective listening tests is extremely time consuming and depends on averaging out listener variations over a large number of subjects. Also, most existing instrumental measures are intrusive and require knowledge of the original signal, which precludes many practical applications. In this paper we show that the proposed non-intrusive single-channel algorithm is a predictor of the perceived level of reverberation that correlates well with subjective listening test results, outperforming many existing intrusive and non-intrusive measures. The algorithm requires only a single training step and has a very low computational complexity, making it suitable for hearing aids and mobile telephone applications. The source code has been made freely available.
-
Conference paperHafezi S, Moore AH, Naylor PA, 2017,
Multi-source estimation consistency for improved multiple direction-of-arrival estimation
, Joint Workshop on Hands-free Speech Communication and Microphone Arrays, Publisher: IEEE, Pages: 81-85
In Direction-of-Arrival (DOA) estimation for multiple sources, removal of noisy data points from a set of local DOA estimates increases the resulting estimation accuracy, especially when there are many sources and they have small angular separation. In this work, we propose a post-processing technique for the enhancement of DOA extraction from a set of local estimates using the consistency of these estimates within the time frame based on adaptive multi-source assumption. Simulations in a realistic reverberant environment with sensor noise and up to 5 sources demonstrate that the proposed technique outperforms the baseline and state-of-the-art approaches. In these tests the proposed technique had the worst average error of 9°, robustness of 5° to widely varying source separation and 3° to number of sources.
-
Conference paperDionelis N, Brookes M, 2017,
Modulation-domain speech enhancement using a Kalman filter with a Bayesian update of speech and noise in the log-spectral domain
, IEEE Conference on Hands-free Speech Communication and Microphone Arrays, Publisher: IEEE
We present a Bayesian estimator that performs log-spectrum estimation of both speech and noise, and is used as a Bayesian Kalman filter update step for single-channel speech enhancement in the modulation domain. We use Kalman filtering in the log-power spectral domain rather than in the amplitude or power spectral domains. In the Bayesian Kalman filter update step, we define the posterior distribution of the clean speech and noise log-power spectra as a two-dimensional multivariate Gaussian distribution. We utilize a Kalman filter observation constraint surface in the three-dimensional space, where the third dimension is the phase factor. We evaluate the results of the phase-sensitive log-spectrum Kalman filter by comparing them with the results obtained by traditional noise suppression techniques and by an alternative Kalman filtering technique that assumes additivity of speech and noise in the power spectral domain.
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.