Search results

  • Conference paper
    Hogg A, Evers C, Naylor P, 2019,

    Multiple Hypothesis Tracking for Overlapping Speaker Segmentation

    , IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
  • Conference paper
    Sharma D, Hogg A, Wang Y, Nour-Eldin A, Naylor P et al., 2019,

    Non-Intrusive POLQA estimation of speech quality using recurrent neural networks

    , European Signal Processing Conference (EUSIPCO)
  • Conference paper
    Hogg A, Naylor P, Evers C, 2019,

    Speaker change detection using fundamental frequency with application to multi-talker segmentation

    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE

    This paper shows that time varying pitch properties can be used advantageously within the segmentation step of a multi-talker diarization system. First a study is conducted to verify that changes in pitch are strong indicators of changes in the speaker. It is then highlighted that an individual’s pitch is smoothly varying and, therefore, can be predicted by means of a Kalman filter. Subsequently it is shown that if the pitch is not predictable then this is most likely due to a change in the speaker. Finally, a novel system is proposed that uses this approach of pitch prediction for speaker change detection. This system is then evaluated against a commonly used MFCC segmentation system. The proposed system is shown to increase the speaker change detection rate from 43.3% to 70.5% on meetings in the AMI corpus. Therefore, there are two equally weighted contributions in this paper: 1. We address the question of whether a change in pitch is a reliable estimator of a speaker change in multi-talk meeting audio. 2. We develop a method to extract such speaker changes and test them on a widely available meeting corpus.

  • Journal article
    Moore A, Xue W, Naylor P, Brookes D et al., 2019,

    Noise covariance matrix estimation for rotating microphone arrays

    , IEEE Transactions on Audio, Speech and Language Processing, Vol: 27, Pages: 519-530, ISSN: 1558-7916

    The noise covariance matrix computed between the signals from a microphone array is used in the design of spatial filters and beamformers with applications in noise suppression and dereverberation. This paper specifically addresses the problem of estimating the covariance matrix associated with a noise field when the array is rotating during desired source activity, as is common in head-mounted arrays. We propose a parametric model that leads to an analytical expression for the microphone signal covariance as a function of the array orientation and array manifold. An algorithm for estimating the model parameters during noise-only segments is proposed and the performance shown to be improved, rather than degraded, by array rotation. The stored model parameters can then be used to update the covariance matrix to account for the effects of any array rotation that occurs when the desired source is active. The proposed method is evaluated in terms of the Frobenius norm of the error in the estimated covariance matrix and of the noise reduction performance of a minimum variance distortionless response beamformer. In simulation experiments the proposed method achieves 18 dB lower error in the estimated noise covariance matrix than a conventional recursive averaging approach and results in noise reduction which is within 0.05 dB of an oracle beamformer using the ground truth noise covariance matrix.

  • Journal article
    Campello A, Ling C, Belfiore J-C, 2018,

    Universal lattice codes for MIMO channels

    , IEEE Transactions on Information Theory, Vol: 64, Pages: 7847-7865, ISSN: 0018-9448

    We propose a coding scheme that achieves the capacity of the compound MIMO channel with algebraic lattices. Our lattice construction exploits the multiplicative structure of number fields and their group of units to absorb ill-conditioned channel realizations. To shape the constellation, a discrete Gaussian distribution over the lattice points is applied. These techniques, along with algebraic properties of the proposed lattices, are then used to construct a sub-optimal de-coupled coding scheme that achieves a constant gap to compound capacity by decoding in a lattice that does not depend on the channel realization. The gap is characterized in terms of algebraic invariants of the code and is shown to be significantly smaller than previous schemes in the literature. We also exhibit alternative algebraic constructions that achieve the capacity of ergodic (SISO) fading channels.

  • Journal article
    Luzzi L, Vehkalahti R, Ling C, 2018,

    Almost universal codes for MIMO wiretap channels

    , IEEE Transactions on Information Theory, Vol: 64, Pages: 7218-7241, ISSN: 0018-9448

    Despite several works on secrecy coding for fading and MIMO wiretap channels from an error probability perspective, the construction of information-theoretically secure codes over such channels remains an open problem. In this paper, we consider a fading wiretap channel model where the transmitter has only partial statistical channel state information. Our channel model includes static channels, i.i.d. block fading channels, and ergodic stationary fading with fast decay of large deviations for the eavesdropper's channel. We extend the flatness factor criterion from the Gaussian wiretap channel to fading and MIMO wiretap channels, and establish a simple design criterion where the normalized product distance/minimum determinant of the lattice and its dual should be maximized simultaneously. Moreover, we propose concrete lattice codes satisfying this design criterion, which are built from algebraic number fields with constant root discriminant in the single-antenna case, and from division algebras centered at such number fields in the multiple-antenna case. The proposed lattice codes achieve strong secrecy and semantic security for all rates R < C_b - C_e - κ, where C_b and C_e are Bob and Eve's channel capacities, respectively, and κ is an explicit constant gap. Furthermore, these codes are almost universal in the sense that a fixed code is good for secrecy for a wide range of fading models. Finally, we consider a compound wiretap model with a more restricted uncertainty set, and show that rates R < C̅_b - C̅_e - κ are achievable, where C̅_b is a lower bound for Bob's capacity and C̅_e is an upper bound for Eve's capacity for all

  • Journal article
    Stott AE, Kanna S, Mandic DP, 2018,

    Widely linear complex partial least squares for latent subspace regression

    , Signal Processing, Vol: 152, Pages: 350-362, ISSN: 0165-1684
  • Journal article
    Dragotti P, Huang J, 2018,

    Photo realistic image completion via dense correspondence

    , IEEE Transactions on Image Processing, Vol: 27, Pages: 5234-5247, ISSN: 1057-7149

    In this paper, we propose an image completion algorithm based on dense correspondence between the input image and an exemplar image retrieved from the Internet. In contrast to traditional methods, which register two images according to sparse correspondence, we propose a hierarchical PatchMatch method that progressively estimates a dense correspondence and is able to capture small deformations between images. The estimated dense correspondence usually has large occlusion areas that correspond to the regions to be completed. A nearest neighbor field (NNF) interpolation algorithm interpolates a smooth and accurate NNF over the occluded region. Given the calculated NNF, the correct image content from the exemplar image is transferred to the input image. Finally, as there could be a color difference between the completed content and the input image, a color correction algorithm is applied to remove the visual artifacts. Numerical results show that our proposed image completion method can achieve photo realistic image completion results.

  • Journal article
    Clerckx B, Kim J, 2018,

    On the beneficial roles of fading and transmit diversity in wireless power transfer with nonlinear energy harvesting

    , IEEE Transactions on Wireless Communications, Vol: 17, Pages: 7731-7743, ISSN: 1536-1276

    We study the effect of channel fading in Wireless Power Transfer (WPT) and show that fading enhances the RF-to-DC conversion efficiency of nonlinear RF energy harvesters. We then develop a new form of signal design for WPT, denoted as Transmit Diversity, that relies on multiple dumb antennas at the transmitter to induce fast fluctuations of the wireless channel. Those fluctuations boost the RF-to-DC conversion efficiency thanks to the energy harvester nonlinearity. In contrast with (energy) beamforming, Transmit Diversity does not rely on Channel State Information at the Transmitter (CSIT) and does not increase the average power at the energy harvester input, though it still enhances the overall end-to-end power transfer efficiency. Transmit Diversity is also combined with recently developed (energy) waveform and modulation to provide further enhancements. The efficacy of the scheme is analyzed using physics-based and curve fitting-based nonlinear models of the energy harvester and demonstrated using circuit simulations, prototyping and experimentation. Measurements with two transmit antennas reveal gains of 50% in harvested DC power over a single transmit antenna setup. The work (again) highlights the crucial role played by the harvester nonlinearity and demonstrates that multiple transmit antennas can be beneficial to WPT even in the absence of CSIT.

  • Journal article
    Campello A, Dadush D, Ling C, 2018,

    AWGN-goodness is enough: capacity-achieving lattice codes based on dithered probabilistic shaping

    , IEEE Transactions on Information Theory, ISSN: 0018-9448

    In this paper we show that any sequence of infinite lattice constellations which is good for the unconstrained Gaussian channel can be shaped into a capacity-achieving sequence of codes for the power-constrained Gaussian channel under lattice decoding and non-uniform signalling. Unlike previous results in the literature, our scheme holds with no extra condition on the lattices (e.g. quantization-goodness or vanishing flatness factor), thus establishing a direct implication between AWGN-goodness, in the sense of Poltyrev, and capacity-achieving codes. Our analysis uses properties of the discrete Gaussian distribution in order to obtain precise bounds on the probability of error and achievable rates. In particular, we obtain a simple characterization of the finite-blocklength behavior of the scheme, showing that it approaches the optimal dispersion coefficient for high signal-to-noise ratio. We further show that for low signal-to-noise ratio the discrete Gaussian over centered lattice constellations cannot achieve capacity, and thus a shift (or “dither”) is essentially necessary.

  • Journal article
    Oliveira V, Martins R, Liow N, Teiserskas J, von Rosenberg W, Adjei T, Shivamurthappa V, Lally PJ, Mandic D, Thayyil S et al., 2018,

    Prognostic accuracy of heart rate variability analysis in neonatal encephalopathy: a systematic review

    , Neonatology, Vol: 115, Pages: 59-67, ISSN: 1661-7800

    BACKGROUND: Heart rate variability analysis offers real-time quantification of autonomic disturbance after perinatal asphyxia, and may therefore aid in disease stratification and prognostication after neonatal encephalopathy (NE). OBJECTIVE: To systematically review the existing literature on the accuracy of early heart rate variability (HRV) to predict brain injury and adverse neurodevelopmental outcomes after NE. DESIGN/METHODS: We systematically searched the literature published between May 1947 and May 2018. We included all prospective and retrospective studies reporting HRV metrics, within the first 7 days of life in babies with NE, and its association with adverse outcomes (defined as evidence of brain injury on magnetic resonance imaging and/or abnormal neurodevelopment at ≥1 year of age). We extracted raw data wherever possible to calculate the prognostic indices with confidence intervals. RESULTS: We retrieved 379 citations, 5 of which met the criteria. One further study was excluded as it analysed an already-included cohort. The 4 studies provided data on 205 babies, 80 (39%) of whom had adverse outcomes. Prognostic accuracy was reported for 12 different HRV metrics and the area under the curve (AUC) varied between 0.79 and 0.94. The best performing metric reported in the included studies was the relative power of high-frequency band, with an AUC of 0.94. CONCLUSIONS: HRV metrics are a promising bedside tool for early prediction of brain injury and neurodevelopmental outcome in babies with NE. Due to the small number of studies available, their heterogeneity and methodological limitations, further research is needed to refine this tool so that it can be used in clinical practice.

  • Journal article
    Liu T, Stathaki T,

    Faster R-CNN for Robust Pedestrian Detection using Semantic Segmentation Network

    , Frontiers in Neurorobotics
  • Journal article
    Reynolds SC, Abrahamsson T, Sjostrom PJ, Schultz S, Dragotti PL et al., 2018,

    CosMIC: a consistent metric for spike inference from calcium imaging

    , Neural Computation, Vol: 30, Pages: 2726-2756, ISSN: 0899-7667

    In recent years, the development of algorithms to detect neuronal spiking activity from two-photon calcium imaging data has received much attention. Meanwhile, few researchers have examined the metrics used to assess the similarity of detected spike trains with the ground truth. We highlight the limitations of the two most commonly used metrics, the spike train correlation and success rate, and propose an alternative, which we refer to as CosMIC. Rather than operating on the true and estimated spike trains directly, the proposed metric assesses the similarity of the pulse trains obtained from convolution of the spike trains with a smoothing pulse. The pulse width, which is derived from the statistics of the imaging data, reflects the temporal tolerance of the metric. The final metric score is the size of the commonalities of the pulse trains as a fraction of their average size. Viewed through the lens of set theory, CosMIC resembles a continuous Sørensen-Dice coefficient — an index commonly used to assess the similarity of discrete, presence/absence data. We demonstrate the ability of the proposed metric to discriminate the precision and recall of spike train estimates. Unlike the spike train correlation, which appears to reward overestimation, the proposed metric score is maximised when the correct number of spikes have been detected. Furthermore, we show that CosMIC is more sensitive to the temporal precision of estimates than the success rate.

  • Journal article
    Xue W, Moore A, Brookes DM, Naylor P et al., 2018,

    Modulation-domain multichannel Kalman filtering for speech enhancement

    , IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 26, Pages: 1833-1847, ISSN: 2329-9290

    Compared with single-channel speech enhancement methods, multichannel methods can utilize spatial information to design optimal filters. Although some filters adaptively consider second-order signal statistics, the temporal evolution of the speech spectrum is usually neglected. By using linear prediction (LP) to model the inter-frame temporal evolution of speech, single-channel Kalman filtering (KF) based methods have been developed for speech enhancement. In this paper, we derive a multichannel KF (MKF) that jointly uses both interchannel spatial correlation and interframe temporal correlation for speech enhancement. We perform LP in the modulation domain, and by incorporating the spatial information, derive an optimal MKF gain in the short-time Fourier transform domain. We show that the proposed MKF reduces to the conventional multichannel Wiener filter if the LP information is discarded. Furthermore, we show that, under an appropriate assumption, the MKF is equivalent to a concatenation of the minimum variance distortionless response beamformer and a single-channel modulation-domain KF and therefore present an alternative implementation of the MKF. Experiments conducted on a public head-related impulse response database demonstrate the effectiveness of the proposed method.

  • Conference paper
    Deng X, Huang J, Liu M, Dragotti PL et al., 2018,


    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 1807-1811
  • Conference paper
    Antonello N, De Sena E, Moonen M, Naylor PA, van Waterschoot M et al., 2018,


    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 6892-6896

    In this paper, source localization and dereverberation are formulated jointly as an inverse problem. The inverse problem consists in the interpolation of the sound field measured by a set of microphones by matching the recorded sound pressure with that of a particular acoustic model. This model is based on a collection of equivalent sources creating either spherical or plane waves. In order to achieve meaningful results, spatial, spatio-temporal and spatio-spectral sparsity can be promoted in the signals originating from the equivalent sources. The inverse problem consists of a large-scale optimization problem that is solved using a first order matrix-free optimization algorithm. It is shown that once the equivalent source signals capable of effectively interpolating the sound field are obtained, they can be readily used to localize a speech sound source in terms of Direction of Arrival (DOA) and to perform dereverberation in a highly reverberant environment.

  • Conference paper
    Yiallourides C, Moore AH, Auvinet E, Van der Straeten C, Naylor PA et al., 2018,

    Acoustic Analysis and Assessment of the Knee in Osteoarthritis During Walking

    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 281-285

    We examine the relation between the sounds emitted by the knee joint during walking and its condition, with particular focus on osteoarthritis, and investigate their potential for noninvasive detection of knee pathology. We present a comparative analysis of several features and evaluate their discriminant power for the task of normal-abnormal signal classification. We statistically evaluate the feature distributions using the two-sample Kolmogorov-Smirnov test and the Bhattacharyya distance. We propose the use of 11 statistics to describe the distributions and test with several classifiers. In our experiments with 249 normal and 297 abnormal acoustic signals from 40 knees, a Support Vector Machine with linear kernel gave the best results with an error rate of 13.9%.

  • Conference paper
    Li Z, Pei W, Xia Y, Wang K, Mandic DP et al., 2018,


    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 4329-4333
  • Conference paper
    Dees BS, Xia Y, Douglas SC, Mandic DP et al., 2018,


    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 4339-4343
  • Conference paper
    Huang J-J, Dragotti PL, 2018,


    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 6777-6781
  • Journal article
    Cheng H, Xia Y, Huang Y, Yang L, Mandic DP et al., 2018,

    A Normalized Complex LMS Based Blind I/Q Imbalance Compensator for GFDM Receivers and Its Full Second-Order Performance Analysis

    , IEEE Transactions on Signal Processing, Vol: 66, Pages: 4701-4712, ISSN: 1053-587X
  • Journal article
    Shi J, Liu L, Gunduz D, Ling C et al., 2018,

    Polar Codes and Polar Lattices for the Heegard-Berger Problem

    , IEEE Transactions on Communications, Vol: 66, Pages: 3760-3771, ISSN: 0090-6778
  • Journal article
    Evers C, Naylor PA, 2018,

    Acoustic SLAM

    , IEEE Transactions on Audio, Speech and Language Processing, Vol: 26, Pages: 1484-1498, ISSN: 1558-7916

    An algorithm is presented that enables devices equipped with microphones, such as robots, to move within their environment in order to explore, adapt to and interact with sound sources of interest. Acoustic scene mapping creates a 3D representation of the positional information of sound sources across time and space. In practice, positional source information is only provided by Direction-of-Arrival (DoA) estimates of the source directions; the source-sensor range is typically difficult to obtain. DoA estimates are also adversely affected by reverberation, noise, and interference, leading to errors in source location estimation and consequent false DoA estimates. Moreover, many acoustic sources, such as human talkers, are not continuously active, such that periods of inactivity lead to missing DoA estimates. Withal, the DoA estimates are specified relative to the observer's sensor location and orientation. Accurate positional information about the observer therefore is crucial. This paper proposes Acoustic Simultaneous Localization and Mapping (aSLAM), which uses acoustic signals to simultaneously map the 3D positions of multiple sound sources whilst passively localizing the observer within the scene map. The performance of aSLAM is analyzed and evaluated using a series of realistic simulations. Results are presented to show the impact of the observer motion and sound source localization accuracy.

  • Journal article
    Clerckx B, Costanzo A, Georgiadis A, Carvalho NB et al., 2018,

    Toward 1G Mobile Power Networks

    , IEEE Microwave Magazine, Vol: 19, Pages: 69-82, ISSN: 1527-3342
  • Conference paper
    Hafezi S, Moore AH, Naylor PA, 2018,


    , 10th IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), Publisher: IEEE, Pages: 395-399, ISSN: 1551-2282
  • Journal article
    Talebi SP, Werner S, Mandic DP, 2018,

    Distributed Adaptive Filtering of alpha-Stable Signals

    , IEEE Signal Processing Letters, Vol: 25, Pages: 1450-1454, ISSN: 1070-9908
  • Journal article
    Xia Y, Douglas SC, Mandic DP, 2018,

    A perspective on CLMS as a deficient length augmented CLMS: Dealing with second order noncircularity

    , Signal Processing, Vol: 149, Pages: 236-245, ISSN: 0165-1684
  • Conference paper
    Moore AH, Lightburn L, Xue W, Naylor P, Brookes D et al.,

    Binaural mask-informed speech enhancement for hearing aids with head tracking

    , International Workshop on Acoustic Signal Enhancement (IWAENC 2018), Publisher: IEEE

    An end-to-end speech enhancement system for hearing aids is proposed which seeks to improve the intelligibility of binaural speech in noise during head movement. The system uses a reference beamformer whose look direction is informed by knowledge of the head orientation and the a priori known direction of the desired source. From this a time-frequency mask is estimated using a deep neural network. The binaural signals are obtained using bilateral beamformers followed by a classical minimum mean square error speech enhancer, modified to use the estimated mask as a speech presence probability prior. In simulated experiments, the improvement in a binaural intelligibility metric (DBSTOI) given by the proposed system relative to beamforming alone corresponds to an SNR improvement of 4 to 6 dB. Results also demonstrate the individual contributions of incorporating the mask and the head orientation-aware beam steering to the proposed system.

  • Conference paper
    Evers C, Loellmann H, Mellmann H, Schmidt A, Barfuss H, Naylor P, Kellermann W et al.,

    LOCATA Challenge - Evaluation Tasks and Measures

    , International Workshop on Acoustic Signal Enhancement (IWAENC 2018), Publisher: IEEE

    Sound source localization and tracking algorithms provide estimates of the positional information about active sound sources in acoustic environments. Despite substantial advances and significant interest in the research community, a comprehensive benchmarking campaign of the various approaches using a common database of audio recordings has, to date, not been performed. The aim of the IEEE-AASP Challenge on sound source localization and tracking (LOCATA) is to objectively benchmark state-of-the-art localization and tracking algorithms using an open-access data corpus of recordings for scenarios typically encountered in audio and acoustic signal processing applications. The challenge tasks range from the localization of a single source with a static microphone array to tracking of multiple moving sources with a moving microphone array. This paper provides an overview of the challenge tasks, describes the performance measures used for evaluation of the LOCATA Challenge, and presents baseline results for the development dataset.

  • Journal article
    Constantinescu MA, Lee S-L, Ernst S, Hemakom A, Mandic D, Yang G-Z et al., 2018,

    Probabilistic guidance for catheter tip motion in cardiac ablation procedures

    , Medical Image Analysis, Vol: 47, Pages: 1-14, ISSN: 1361-8415

    Radiofrequency catheter ablation is one of the commonly available therapeutic methods for patients suffering from cardiac arrhythmias. The prerequisite of successful ablation is sufficient energy delivery at the target site. However, cardiac and respiratory motion, coupled with endocardial irregularities, can cause catheter drift and dispersion of the radiofrequency energy, thus prolonging procedure time, damaging adjacent tissue, and leading to electrical reconnection of temporarily ablated regions. Therefore, positional accuracy and stability of the catheter tip during energy delivery is of great importance for the outcome of the procedure. This paper presents an analytical scheme for assessing catheter tip stability, whereby a sequence of catheter tip motion recorded at sparse locations on the endocardium is decomposed. The spatial sliding component along the endocardial wall is extracted from the recording and maximal slippage and its associated probability are computed at each mapping point. Finally, a global map is generated, allowing the assessment of potential areas that are compromised by tip slippage. The proposed framework was applied to 40 retrospective studies of congenital heart disease patients and further validated on phantom data and simulations. The results show a good correlation with other intraoperative factors, such as catheter tip contact force amplitude and orientation, and with clinically documented anatomical areas of high catheter tip instability.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
