Search results

  • Conference paper
    Neo VW, Weiss S, McKnight S, Hogg A, Naylor PA et al., 2022,

    Polynomial eigenvalue decomposition-based target speaker voice activity detection in the presence of competing talkers

    , 17th International Workshop on Acoustic Signal Enhancement

    Voice activity detection (VAD) algorithms are essential for many speech processing applications, such as speaker diarization, automatic speech recognition, speech enhancement, and speech coding. With a good VAD algorithm, non-speech segments can be excluded to improve the performance and computation of these applications. In this paper, we propose a polynomial eigenvalue decomposition-based target-speaker VAD algorithm to detect unseen target speakers in the presence of competing talkers. The proposed approach uses frame-based processing to compute the syndrome energy, used for testing the presence or absence of a target speaker. The proposed approach is consistently among the best in F1 and balanced accuracy scores over the investigated range of signal-to-interference ratio (SIR) from -10 dB to 20 dB.

  • Conference paper
    McKnight S, Hogg A, Neo V, Naylor P et al., 2022,

    A study of salient modulation domain features for speaker identification

    , Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Publisher: IEEE, Pages: 705-712

    This paper studies the ranges of acoustic and modulation frequencies of speech most relevant for identifying speakers and compares the speaker-specific information present in the temporal envelope against that present in the temporal fine structure. This study uses correlation and feature importance measures, random forest and convolutional neural network models, and reconstructed speech signals with specific acoustic and/or modulation frequencies removed to identify the salient points. It is shown that the range of modulation frequencies associated with the fundamental frequency is more important than the 1-16 Hz range most commonly used in automatic speech recognition, and that the 0 Hz modulation frequency band contains significant speaker information. It is also shown that the temporal envelope is more discriminative among speakers than the temporal fine structure, but that the temporal fine structure still contains useful additional information for speaker identification. This research aims to provide a timely addition to the literature by identifying specific aspects of speech relevant for speaker identification that could be used to enhance the discriminant capabilities of machine learning models.

  • Conference paper
    Hogg AOT, Neo VW, Weiss S, Evers C, Naylor PA et al., 2021,

    A Polynomial Eigenvalue Decomposition Music Approach for Broadband Sound Source Localization

    , 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE
  • Conference paper
    Hogg AOT, Evers C, Naylor PA, 2021,

    Multichannel Overlapping Speaker Segmentation Using Multiple Hypothesis Tracking Of Acoustic And Spatial Features

    , ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE
  • Conference paper
    Neo VW, Evers C, Naylor PA, 2021,

    Polynomial matrix eigenvalue decomposition of spherical harmonics for speech enhancement

    , IEEE International Conference on Acoustics, Speech and Signal Processing, Publisher: IEEE, Pages: 786-790

    Speech enhancement algorithms using polynomial matrix eigenvalue decomposition (PEVD) have been shown to be effective for noisy and reverberant speech. However, these algorithms do not scale well in complexity with the number of channels used in the processing. For a spherical microphone array sampling an order-limited sound field, the spherical harmonics provide a compact representation of the microphone signals in the form of eigenbeams. We propose a PEVD algorithm that uses only the lower dimension eigenbeams for speech enhancement at a significantly lower computation cost. The proposed algorithm is shown to significantly reduce complexity while maintaining full performance. Informal listening examples have also indicated that the processing does not introduce any noticeable artefacts.

  • Journal article
    Hogg A, Evers C, Moore A, Naylor P et al., 2021,

    Overlapping speaker segmentation using multiple hypothesis tracking of fundamental frequency

    , IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 29, Pages: 1479-1490, ISSN: 2329-9290

    This paper demonstrates how the harmonic structure of voiced speech can be exploited to segment multiple overlapping speakers in a speaker diarization task. We explore how a change in the speaker can be inferred from a change in pitch. We show that voiced harmonics can be useful in detecting when more than one speaker is talking, such as during overlapping speaker activity. A novel system is proposed to track multiple harmonics simultaneously, allowing for the determination of onsets and end-points of a speaker’s utterance in the presence of an additional active speaker. This system is benchmarked against a segmentation system from the literature that employs a bidirectional long short-term memory network (BLSTM) approach and requires training. Experimental results highlight that the proposed approach outperforms the BLSTM baseline approach by 12.9% in terms of HIT rate for speaker segmentation. We also show that the estimated pitch tracks of our system can be used as features to the BLSTM to achieve further improvements of 1.21% in terms of coverage and 2.45% in terms of purity.

  • Conference paper
    Neo VW, Evers C, Naylor PA, 2021,

    Speech dereverberation performance of a polynomial-EVD subspace approach

    , European Signal Processing Conference (EUSIPCO), Publisher: IEEE, ISSN: 2076-1465

    The degradation of speech arising from additive background noise and reverberation affects the performance of important speech applications such as telecommunications, hearing aids, voice-controlled systems and robot audition. In this work, we focus on dereverberation. It is shown that the parameterized polynomial matrix eigenvalue decomposition (PEVD)-based speech enhancement algorithm exploits the lack of correlation between speech and the late reflections to enhance the speech component associated with the direct path and early reflections. The algorithm's performance is evaluated using simulations involving measured acoustic impulse responses and noise from the ACE corpus. The simulations and informal listening examples have indicated that the PEVD-based algorithm performs dereverberation over a range of SNRs without introducing any noticeable processing artefacts.

  • Conference paper
    McKnight SW, Hogg A, Naylor P, 2020,

    Analysis of phonetic dependence of segmentation errors in speaker diarization

    , European Signal Processing Conference (EUSIPCO), Publisher: IEEE, ISSN: 2076-1465

    Evaluation of speaker segmentation and diarization normally makes use of forgiveness collars around ground truth speaker segment boundaries such that estimated speaker segment boundaries with such collars are considered completely correct. This paper shows that the popular recent approach of removing forgiveness collars from speaker diarization evaluation tools can unfairly penalize speaker diarization systems that correctly estimate speaker segment boundaries. The uncertainty in identifying the start and/or end of a particular phoneme means that the ground truth segmentation is not perfectly accurate, and even trained human listeners are unable to identify phoneme boundaries with full consistency. This research analyses the phoneme dependence of this uncertainty, and shows that it depends on (i) whether the phoneme being detected is at the start or end of an utterance and (ii) what the phoneme is, so that the use of a uniform forgiveness collar is inadequate. This analysis is expected to point the way towards more indicative and repeatable assessment of the performance of speaker diarization systems.

  • Conference paper
    Neo VW, Evers C, Naylor PA, 2020,

    PEVD-based speech enhancement in reverberant environments

    , IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE, Pages: 186-190

    The enhancement of noisy speech is important for applications involving human-to-human interactions, such as telecommunications and hearing aids, as well as human-to-machine interactions, such as voice-controlled systems and robot audition. In this work, we focus on reverberant environments. It is shown that, by exploiting the lack of correlation between speech and the late reflections, further noise reduction can be achieved. This is verified using simulations involving actual acoustic impulse responses and noise from the ACE corpus. The simulations show that even without using a noise estimator, our proposed method simultaneously achieves noise reduction, and enhancement of speech quality and intelligibility, in reverberant environments over a wide range of SNRs. Furthermore, informal listening examples highlight that our approach does not introduce any significant processing artefacts such as musical noise.

  • Journal article
    Joudeh H, Clerckx B, 2019,

    On the optimality of treating inter-cell interference as noise in uplink cellular networks

    , IEEE Transactions on Information Theory, Vol: 65, Pages: 7208-7232, ISSN: 0018-9448

    In this paper, we explore the information-theoretic optimality of treating interference as noise (TIN) in cellular networks. We focus on uplink scenarios modeled by the Gaussian interfering multiple access channel (IMAC), comprising K mutually interfering multiple access channels (MACs), each formed by an arbitrary number of transmitters communicating independent messages to one receiver. We define TIN for this setting as a scheme in which each MAC (or cell) performs a power-controlled version of its capacity-achieving strategy, with Gaussian codebooks and successive decoding, while treating interference from all other MACs (i.e. inter-cell interference) as noise. We characterize the generalized degrees-of-freedom (GDoF) region achieved through the proposed TIN scheme, and then identify conditions under which this achievable region is convex without the need for time-sharing. We then tighten these convexity conditions and identify a regime in which the proposed TIN scheme achieves the entire GDoF region of the IMAC and is within a constant gap of the entire capacity region.

  • Journal article
    Kotzagiannidis MS, Dragotti PL, 2019,

    Sampling and reconstruction of sparse signals on circulant graphs – an introduction to graph-FRI

    , Applied and Computational Harmonic Analysis, Vol: 47, Pages: 539-565, ISSN: 1096-603X

    With the objective of employing graphs toward a more generalized theory of signal processing, we present a novel sampling framework for (wavelet-)sparse signals defined on circulant graphs which extends basic properties of Finite Rate of Innovation (FRI) theory to the graph domain, and can be applied to arbitrary graphs via suitable approximation schemes. At its core, the introduced Graph-FRI-framework states that any K-sparse signal on the vertices of a circulant graph can be perfectly reconstructed from its dimensionality-reduced representation in the graph spectral domain, the Graph Fourier Transform (GFT), of minimum size 2K. By leveraging the recently developed theory of e-splines and e-spline wavelets on graphs, one can decompose this graph spectral transformation into the multiresolution low-pass filtering operation with a graph e-spline filter, with subsequent transformation to the spectral graph domain; this allows to infer a distinct sampling pattern, and, ultimately, the structure of an associated coarsened graph, which preserves essential properties of the original, including circularity and, where applicable, the graph generating set.

  • Conference paper
    Hogg AOT, Evers C, Naylor PA, 2019,

    Multiple Hypothesis Tracking for Overlapping Speaker Segmentation

    , 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE
  • Conference paper
    Neo VW, Evers C, Naylor PA, 2019,

    Speech Enhancement Using Polynomial Eigenvalue Decomposition

    , 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Publisher: IEEE
  • Journal article
    Kotzagiannidis MS, Dragotti PL, 2019,

    Splines and Wavelets on Circulant Graphs

    , Applied and Computational Harmonic Analysis, Vol: 47, Pages: 481-515, ISSN: 1096-603X

    We present novel families of wavelets and associated filterbanks for the analysis and representation of functions defined on circulant graphs. In this work, we leverage the inherent vanishing moment property of the circulant graph Laplacian operator, and by extension, the e-graph Laplacian, which is established as a parameterization of the former with respect to the degree per node, for the design of vertex-localized and critically-sampled higher-order graph (e-)spline wavelet filterbanks, which can reproduce and annihilate classes of (exponential) polynomial signals on circulant graphs. In addition, we discuss similarities and analogies of the detected properties and resulting constructions with splines and spline wavelets in the Euclidean domain. Ultimately, we consider generalizations to arbitrary graphs in the form of graph approximations, with focus on graph product decompositions. In particular, we proceed to show how the use of graph products facilitates a multi-dimensional extension of the proposed constructions and properties.

  • Conference paper
    Sharma D, Hogg AOT, Wang Y, Nour-Eldin A, Naylor PA et al., 2019,

    Non-Intrusive POLQA Estimation of Speech Quality using Recurrent Neural Networks

    , 2019 27th European Signal Processing Conference (EUSIPCO), Publisher: IEEE
  • Conference paper
    Hogg AOT, Evers C, Naylor PA, 2019,

    Speaker Change Detection Using Fundamental Frequency with Application to Multi-talker Segmentation

    , ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE
  • Conference paper
    Neo VW, Naylor PA, 2019,

    Second Order Sequential Best Rotation Algorithm with Householder Reduction for Polynomial Matrix Eigenvalue Decomposition

    , ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Publisher: IEEE
  • Journal article
    Campello A, Dadush D, Ling C, 2019,

    AWGN-goodness is enough: capacity-achieving lattice codes based on dithered probabilistic shaping

    , IEEE Transactions on Information Theory, Vol: 65, Pages: 1961-1971, ISSN: 0018-9448

    In this paper we show that any sequence of infinite lattice constellations which is good for the unconstrained Gaussian channel can be shaped into a capacity-achieving sequence of codes for the power-constrained Gaussian channel under lattice decoding and non-uniform signalling. Unlike previous results in the literature, our scheme holds with no extra condition on the lattices (e.g. quantization-goodness or vanishing flatness factor), thus establishing a direct implication between AWGN-goodness, in the sense of Poltyrev, and capacity-achieving codes. Our analysis uses properties of the discrete Gaussian distribution in order to obtain precise bounds on the probability of error and achievable rates. In particular, we obtain a simple characterization of the finite-blocklength behavior of the scheme, showing that it approaches the optimal dispersion coefficient for high signal-to-noise ratio. We further show that for low signal-to-noise ratio the discrete Gaussian over centered lattice constellations cannot achieve capacity, and thus a shift (or “dither”) is essentially necessary.

  • Journal article
    Moore A, Xue W, Naylor P, Brookes D et al., 2019,

    Noise covariance matrix estimation for rotating microphone arrays

    , IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 27, Pages: 519-530, ISSN: 2329-9290

    The noise covariance matrix computed between the signals from a microphone array is used in the design of spatial filters and beamformers with applications in noise suppression and dereverberation. This paper specifically addresses the problem of estimating the covariance matrix associated with a noise field when the array is rotating during desired source activity, as is common in head-mounted arrays. We propose a parametric model that leads to an analytical expression for the microphone signal covariance as a function of the array orientation and array manifold. An algorithm for estimating the model parameters during noise-only segments is proposed and the performance shown to be improved, rather than degraded, by array rotation. The stored model parameters can then be used to update the covariance matrix to account for the effects of any array rotation that occurs when the desired source is active. The proposed method is evaluated in terms of the Frobenius norm of the error in the estimated covariance matrix and of the noise reduction performance of a minimum variance distortionless response beamformer. In simulation experiments the proposed method achieves 18 dB lower error in the estimated noise covariance matrix than a conventional recursive averaging approach and results in noise reduction which is within 0.05 dB of an oracle beamformer using the ground truth noise covariance matrix.

  • Journal article
    Campello A, Ling C, Belfiore J-C, 2018,

    Universal lattice codes for MIMO channels

    , IEEE Transactions on Information Theory, Vol: 64, Pages: 7847-7865, ISSN: 0018-9448

    We propose a coding scheme that achieves the capacity of the compound MIMO channel with algebraic lattices. Our lattice construction exploits the multiplicative structure of number fields and their group of units to absorb ill-conditioned channel realizations. To shape the constellation, a discrete Gaussian distribution over the lattice points is applied. These techniques, along with algebraic properties of the proposed lattices, are then used to construct a sub-optimal de-coupled coding scheme that achieves a constant gap to compound capacity by decoding in a lattice that does not depend on the channel realization. The gap is characterized in terms of algebraic invariants of the code and is shown to be significantly smaller than previous schemes in the literature. We also exhibit alternative algebraic constructions that achieve the capacity of ergodic (SISO) fading channels.

  • Conference paper
    Moore AH, Lightburn L, Xue W, Naylor P, Brookes D et al., 2018,

    Binaural mask-informed speech enhancement for hearing aids with head tracking

    , International Workshop on Acoustic Signal Enhancement (IWAENC 2018), Publisher: IEEE, Pages: 461-465

    An end-to-end speech enhancement system for hearing aids is proposed which seeks to improve the intelligibility of binaural speech in noise during head movement. The system uses a reference beamformer whose look direction is informed by knowledge of the head orientation and the a priori known direction of the desired source. From this a time-frequency mask is estimated using a deep neural network. The binaural signals are obtained using bilateral beamformers followed by a classical minimum mean square error speech enhancer, modified to use the estimated mask as a speech presence probability prior. In simulated experiments, the improvement in a binaural intelligibility metric (DBSTOI) given by the proposed system relative to beamforming alone corresponds to an SNR improvement of 4 to 6 dB. Results also demonstrate the individual contributions of incorporating the mask and the head orientation-aware beam steering to the proposed system.

  • Conference paper
    Evers C, Loellmann H, Mellmann H, Schmidt A, Barfuss H, Naylor P, Kellermann W et al., 2018,

    LOCATA challenge - evaluation tasks and measures

    , International Workshop on Acoustic Signal Enhancement (IWAENC 2018), Publisher: IEEE

    Sound source localization and tracking algorithms provide estimates of the positional information about active sound sources in acoustic environments. Despite substantial advances and significant interest in the research community, a comprehensive benchmarking campaign of the various approaches using a common database of audio recordings has, to date, not been performed. The aim of the IEEE-AASP Challenge on sound source localization and tracking (LOCATA) is to objectively benchmark state-of-the-art localization and tracking algorithms using an open-access data corpus of recordings for scenarios typically encountered in audio and acoustic signal processing applications. The challenge tasks range from the localization of a single source with a static microphone array to tracking of multiple moving sources with a moving microphone array. This paper provides an overview of the challenge tasks, describes the performance measures used for evaluation of the LOCATA Challenge, and presents baseline results for the development dataset.

  • Journal article
    Clerckx B, Kim J, 2018,

    On the beneficial roles of fading and transmit diversity in wireless power transfer with nonlinear energy harvesting

    , IEEE Transactions on Wireless Communications, Vol: 17, Pages: 7731-7743, ISSN: 1536-1276

    We study the effect of channel fading in Wireless Power Transfer (WPT) and show that fading enhances the RF-to-DC conversion efficiency of nonlinear RF energy harvesters. We then develop a new form of signal design for WPT, denoted as Transmit Diversity, that relies on multiple dumb antennas at the transmitter to induce fast fluctuations of the wireless channel. Those fluctuations boost the RF-to-DC conversion efficiency thanks to the energy harvester nonlinearity. In contrast with (energy) beamforming, Transmit Diversity does not rely on Channel State Information at the Transmitter (CSIT) and does not increase the average power at the energy harvester input, though it still enhances the overall end-to-end power transfer efficiency. Transmit Diversity is also combined with recently developed (energy) waveform and modulation to provide further enhancements. The efficacy of the scheme is analyzed using physics-based and curve fitting-based nonlinear models of the energy harvester and demonstrated using circuit simulations, prototyping and experimentation. Measurements with two transmit antennas reveal gains of 50% in harvested DC power over a single transmit antenna setup. The work (again) highlights the crucial role played by the harvester nonlinearity and demonstrates that multiple transmit antennas can be beneficial to WPT even in the absence of CSIT.

  • Journal article
    Dragotti P, Huang J, 2018,

    Photo realistic image completion via dense correspondence

    , IEEE Transactions on Image Processing, Vol: 27, Pages: 5234-5247, ISSN: 1057-7149

    In this paper, we propose an image completion algorithm based on dense correspondence between the input image and an exemplar image retrieved from the Internet. Contrary to traditional methods which register two images according to sparse correspondence, in this paper, we propose a hierarchical PatchMatch method that progressively estimates a dense correspondence, which is able to capture small deformations between images. The estimated dense correspondence has usually large occlusion areas that correspond to the regions to be completed. A nearest neighbor field (NNF) interpolation algorithm interpolates a smooth and accurate NNF over the occluded region. Given the calculated NNF, the correct image content from the exemplar image is transferred to the input image. Finally, as there could be a color difference between the completed content and the input image, a color correction algorithm is applied to remove the visual artifacts. Numerical results show that our proposed image completion method can achieve photo realistic image completion results.

  • Journal article
    Stott AE, Kanna S, Mandic DP, 2018,

    Widely linear complex partial least squares for latent subspace regression

    , Signal Processing, Vol: 152, Pages: 350-362, ISSN: 0165-1684
  • Journal article
    Luzzi L, Vehkalahti R, Ling C, 2018,

    Almost universal codes for MIMO wiretap channels

    , IEEE Transactions on Information Theory, Vol: 64, Pages: 7218-7241, ISSN: 0018-9448

    Despite several works on secrecy coding for fading and MIMO wiretap channels from an error probability perspective, the construction of information-theoretically secure codes over such channels remains an open problem. In this paper, we consider a fading wiretap channel model where the transmitter has only partial statistical channel state information. Our channel model includes static channels, i.i.d. block fading channels, and ergodic stationary fading with fast decay of large deviations for the eavesdropper's channel. We extend the flatness factor criterion from the Gaussian wiretap channel to fading and MIMO wiretap channels, and establish a simple design criterion where the normalized product distance/minimum determinant of the lattice and its dual should be maximized simultaneously. Moreover, we propose concrete lattice codes satisfying this design criterion, which are built from algebraic number fields with constant root discriminant in the single-antenna case, and from division algebras centered at such number fields in the multiple-antenna case. The proposed lattice codes achieve strong secrecy and semantic security for all rates R < C_b - C_e - κ, where C_b and C_e are Bob and Eve's channel capacities, respectively, and κ is an explicit constant gap. Furthermore, these codes are almost universal in the sense that a fixed code is good for secrecy for a wide range of fading models. Finally, we consider a compound wiretap model with a more restricted uncertainty set, and show that rates R < C̅_b - C̅_e - κ are achievable, where C̅_b is a lower bound for Bob's capacity and C̅_e is an upper bound for Eve's capacity for all

  • Conference paper
    Leung KK, Wang S, Tuor T, Salonidis T, Makaya C, He T, Chan K et al., 2018,

    When edge meets learning: adaptive control for resource-constrained distributed machine learning

    , IEEE Infocom 2018, Publisher: IEEE

    Emerging technologies and applications including Internet of Things (IoT), social networking, and crowd-sourcing generate large amounts of data at the network edge. Machine learning models are often built from the collected data, to enable the detection, classification, and prediction of future events. Due to bandwidth, storage, and privacy concerns, it is often impractical to send all the data to a centralized location. In this paper, we consider the problem of learning model parameters from data distributed across multiple edge nodes, without sending raw data to a centralized place. Our focus is on a generic class of machine learning models that are trained using gradient-descent based approaches. We analyze the convergence rate of distributed gradient descent from a theoretical point of view, based on which we propose a control algorithm that determines the best trade-off between local update and global parameter aggregation to minimize the loss function under a given resource budget. The performance of the proposed algorithm is evaluated via extensive experiments with real datasets, both on a networked prototype system and in a larger-scale simulated environment. The experimentation results show that our proposed approach performs near to the optimum with various machine learning models and different data distributions.

  • Journal article
    Oliveira V, Martins R, Liow N, Teiserskas J, von Rosenberg W, Adjei T, Shivamurthappa V, Lally PJ, Mandic D, Thayyil S et al., 2018,

    Prognostic accuracy of heart rate variability analysis in neonatal encephalopathy: a systematic review

    , Neonatology, Vol: 115, Pages: 59-67, ISSN: 1661-7800

    BACKGROUND: Heart rate variability analysis offers real-time quantification of autonomic disturbance after perinatal asphyxia, and may therefore aid in disease stratification and prognostication after neonatal encephalopathy (NE). OBJECTIVE: To systematically review the existing literature on the accuracy of early heart rate variability (HRV) to predict brain injury and adverse neurodevelopmental outcomes after NE. DESIGN/METHODS: We systematically searched the literature published between May 1947 and May 2018. We included all prospective and retrospective studies reporting HRV metrics, within the first 7 days of life in babies with NE, and its association with adverse outcomes (defined as evidence of brain injury on magnetic resonance imaging and/or abnormal neurodevelopment at ≥1 year of age). We extracted raw data wherever possible to calculate the prognostic indices with confidence intervals. RESULTS: We retrieved 379 citations, 5 of which met the criteria. One further study was excluded as it analysed an already-included cohort. The 4 studies provided data on 205 babies, 80 (39%) of whom had adverse outcomes. Prognostic accuracy was reported for 12 different HRV metrics and the area under the curve (AUC) varied between 0.79 and 0.94. The best performing metric reported in the included studies was the relative power of high-frequency band, with an AUC of 0.94. CONCLUSIONS: HRV metrics are a promising bedside tool for early prediction of brain injury and neurodevelopmental outcome in babies with NE. Due to the small number of studies available, their heterogeneity and methodological limitations, further research is needed to refine this tool so that it can be used in clinical practice.

  • Journal article
    Liu T, Stathaki T, 2018,

    Faster R-CNN for Robust Pedestrian Detection using Semantic Segmentation Network

    , Frontiers in Neurorobotics
  • Journal article
    Xue W, Moore A, Brookes DM, Naylor P et al., 2018,

    Modulation-domain multichannel Kalman filtering for speech enhancement

    , IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol: 26, Pages: 1833-1847, ISSN: 2329-9290

    Compared with single-channel speech enhancement methods, multichannel methods can utilize spatial information to design optimal filters. Although some filters adaptively consider second-order signal statistics, the temporal evolution of the speech spectrum is usually neglected. By using linear prediction (LP) to model the inter-frame temporal evolution of speech, single-channel Kalman filtering (KF) based methods have been developed for speech enhancement. In this paper, we derive a multichannel KF (MKF) that jointly uses both interchannel spatial correlation and interframe temporal correlation for speech enhancement. We perform LP in the modulation domain, and by incorporating the spatial information, derive an optimal MKF gain in the short-time Fourier transform domain. We show that the proposed MKF reduces to the conventional multichannel Wiener filter if the LP information is discarded. Furthermore, we show that, under an appropriate assumption, the MKF is equivalent to a concatenation of the minimum variance distortionless response beamformer and a single-channel modulation-domain KF and therefore present an alternative implementation of the MKF. Experiments conducted on a public head-related impulse response database demonstrate the effectiveness of the proposed method.

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.