Publications
283 results found
Wang Z, Lo PW, Huang Y, et al., 2023, Tactile perception: a biomimetic whisker-based method for clinical gastrointestinal diseases screening, npj Robotics, ISSN: 2731-4278
Ghosh T, McCrory MA, Marden T, et al., 2023, I2N: image to nutrients, a sensor guided semi-automated tool for annotation of images for nutrition analysis of eating episodes, Frontiers in Nutrition, Vol: 10, Pages: 1-9, ISSN: 2296-861X
INTRODUCTION: Dietary assessment is important for understanding nutritional status. Traditional methods of monitoring food intake through self-report such as diet diaries, 24-hour dietary recall, and food frequency questionnaires may be subject to errors and can be time-consuming for the user. METHODS: This paper presents a semi-automatic dietary assessment tool we developed - a desktop application called Image to Nutrients (I2N) - to process sensor-detected eating events and images captured during these eating events by a wearable sensor. I2N has the capacity to offer multiple food and nutrient databases (e.g., USDA-SR, FNDDS, USDA Global Branded Food Products Database) for annotating eating episodes and food items. I2N estimates energy intake, nutritional content, and the amount consumed. The components of I2N are three-fold: 1) sensor-guided image review, 2) annotation of food images for nutritional analysis, and 3) access to multiple food databases. Two studies were used to evaluate the feasibility and usefulness of I2N: 1) a US-based study with 30 participants and a total of 60 days of data and 2) a Ghana-based study with 41 participants and a total of 41 days of data. RESULTS: In both studies, a total of 314 eating episodes were annotated using at least three food databases. Using I2N's sensor-guided image review, the number of images that needed to be reviewed was reduced by 93% and 85% for the two studies, respectively, compared to reviewing all the images. DISCUSSION: I2N is a unique tool that allows for simultaneous viewing of food images, sensor-guided image review, and access to multiple databases in one tool, making nutritional analysis of food images efficient. The tool is flexible, allowing for nutritional analysis of images if sensor signals aren't available.
Gu X, Fani D, Han J, et al., 2023, Beyond supervised learning for pervasive healthcare, IEEE Reviews in Biomedical Engineering, Pages: 1-21, ISSN: 1937-3333
The integration of machine/deep learning and sensing technologies is transforming healthcare and medical practice. However, inherent limitations in healthcare data, namely scarcity, quality, and heterogeneity, hinder the effectiveness of supervised learning techniques which are mainly based on pure statistical fitting between data and labels. In this paper, we first identify the challenges present in machine learning for pervasive healthcare and we then review the current trends beyond fully supervised learning that are developed to address these three issues. Rooted in the inherent drawbacks of empirical risk minimization that underpins pure fully supervised learning, this survey summarizes seven key lines of learning strategies, to promote the generalization performance for real-world deployment. In addition, we point out several directions that are emerging and promising in this area, to develop data-efficient, scalable, and trustworthy computational models, and to leverage multi-modality and multi-source sensing informatics, for pervasive healthcare.
Gu X, Han J, Yang G-Z, et al., 2023, Generalizable movement intention recognition with multiple heterogenous EEG datasets, the 2023 IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 9858-9864
Human movement intention recognition is important for human-robot interaction. Existing work based on motor imagery electroencephalogram (EEG) provides a non-invasive and portable solution for intention detection. However, the data-driven methods may suffer from the limited scale and diversity of the training datasets, which result in poor generalization performance on new test subjects. It is practically difficult to directly aggregate data from multiple datasets for training, since they often employ different channels and collected data suffers from significant domain shifts caused by different devices, experiment setup, etc. On the other hand, the inter-subject heterogeneity is also substantial due to individual differences in EEG representations. In this work, we developed two networks to learn from both the shared and the complete channels across datasets, handling inter-subject and inter-dataset heterogeneity respectively. Based on both networks, we further developed an online knowledge co-distillation framework to collaboratively learn from both networks, achieving coherent performance boosts. Experimental results have shown that our proposed method can effectively aggregate knowledge from multiple datasets, demonstrating better generalization in the context of cross-subject validation.
Calo J, Lo B, 2023, Federated Blockchain Learning at the Edge, Information (Switzerland), Vol: 14
Machine learning, particularly using neural networks, is now widely adopted in practice even with the IoT paradigm; however, training neural networks at the edge, on IoT devices, remains elusive, mainly due to computational requirements. Furthermore, effective training requires large quantities of data, and privacy concerns restrict accessible data. Therefore, in this paper, we propose a method leveraging a blockchain and federated learning to train neural networks at the edge, effectively bypassing these issues and providing additional benefits such as distributing training across multiple devices. Federated learning trains networks without storing any data and aggregates multiple networks, trained on unique data, forming a global network via a centralized server. By leveraging the decentralized nature of a blockchain, this centralized server is replaced by a P2P network, removing the need for a trusted centralized server and enabling the learning process to be distributed across participating devices. Our results show that networks trained in such a manner have negligible differences in accuracy compared to traditionally trained networks on IoT devices and are less prone to overfitting. We conclude that not only is this a viable alternative to traditional paradigms but is an improvement that contains a wealth of benefits in an ecosystem such as a hospital.
Li Y, Luo S, Zhang H, et al., 2023, MtCLSS: Multi-Task Contrastive Learning for Semi-Supervised Pediatric Sleep Staging., IEEE J Biomed Health Inform, Vol: 27, Pages: 2647-2655
The continuing increase in the incidence and recognition of children's sleep disorders has heightened the demand for automatic pediatric sleep staging. Supervised sleep stage recognition algorithms, however, are often faced with challenges such as limited availability of pediatric sleep physicians and data heterogeneity. Drawing upon two quickly advancing fields, i.e., semi-supervised learning and self-supervised contrastive learning, we propose a multi-task contrastive learning strategy for semi-supervised pediatric sleep stage recognition, abbreviated as MtCLSS. Specifically, signal-adapted transformations are applied to electroencephalogram (EEG) recordings of the full night polysomnogram, which facilitates the network to improve its representation ability through identifying the transformations. We also introduce an extension of contrastive loss function, thus adapting contrastive learning to the semi-supervised setting. In this way, the proposed framework learns not only task-specific features from a small amount of supervised data, but also extracts general features from signal transformations, improving the model robustness. MtCLSS is evaluated on a real-world pediatric sleep dataset with promising performance (0.80 accuracy, 0.78 F1-score and 0.74 kappa). We also examine its generality on a well-known public dataset. The experimental results demonstrate the effectiveness of the MtCLSS framework for EEG based automatic pediatric sleep staging in very limited labeled data scenarios.
Zhou X, Yang Z, Ren Y, et al., 2023, Modified Bilateral Active Estimation Model: A Learning-Based Solution to the Time Delay Problem in Robotic Tele-Control, IEEE Robotics and Automation Letters, Vol: 8, Pages: 2653-2660
The ubiquitous presence of three types of delay in robotic teleoperation systems, i.e., computation delay, transmission delay, and mechanical delay, is a major factor in system degradation. It is noticeable that the transmission latency over the communication network shows a periodic trend due to the network flux changing. Accordingly, in our previous work, a neural network-based open-loop approach named Bilateral Active Estimation Model (BAEM) was proposed to compensate for the upcoming transmission delay in a unilateral teleoperation system by sending predicted trajectories as commands. In this letter, a modified version of BAEM (m-BAEM) is proposed to compensate for all these three types of delay explicitly, and a real-time robotic teleoperation system based on Robot Operating System 2 (ROS 2) framework is built to evaluate the performance of the m-BAEM in constant and varying delay scenarios with both pre-defined and human-input trajectories. The results of pre-defined trajectories present the satisfactory performance of the m-BAEM even in the presence of transmission delay up to 1000 milliseconds with large variations. The main limitation of the m-BAEM is that it is yet unable to handle unknown trajectories.
Zhang R, Chen J, Wang Z, et al., 2023, A Step Towards Conditional Autonomy-Robotic Appendectomy, IEEE ROBOTICS AND AUTOMATION LETTERS, Vol: 8, Pages: 2429-2436, ISSN: 2377-3766
Qiu J, Lo FP-W, Gu X, et al., 2023, Egocentric image captioning for privacy-preserved passive dietary intake monitoring, IEEE Transactions on Cybernetics, Vol: PP, Pages: 1-14, ISSN: 1083-4419
Camera-based passive dietary intake monitoring is able to continuously capture the eating episodes of a subject, recording rich visual information, such as the type and volume of food being consumed, as well as the eating behaviors of the subject. However, there currently is no method that is able to incorporate these visual clues and provide a comprehensive context of dietary intake from passive recording (e.g., is the subject sharing food with others, what food the subject is eating, and how much food is left in the bowl). On the other hand, privacy is a major concern while egocentric wearable cameras are used for capturing. In this article, we propose a privacy-preserved secure solution (i.e., egocentric image captioning) for dietary assessment with passive monitoring, which unifies food recognition, volume estimation, and scene understanding. By converting images into rich text descriptions, nutritionists can assess individual dietary intake based on the captions instead of the original images, reducing the risk of privacy leakage from images. To this end, an egocentric dietary image captioning dataset has been built, which consists of in-the-wild images captured by head-worn and chest-worn cameras in field studies in Ghana. A novel transformer-based architecture is designed to caption egocentric dietary images. Comprehensive experiments have been conducted to evaluate the effectiveness and to justify the design of the proposed architecture for egocentric dietary image captioning. To the best of our knowledge, this is the first work that applies image captioning for dietary intake assessment in real-life settings.
Zhang C, Jovanov E, Liao H, et al., 2023, Video Based Cocktail Causal Container for Blood Pressure Classification and Blood Glucose Prediction, IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, Vol: 27, Pages: 1118-1128, ISSN: 2168-2194
Alian A, Zari E, Wang Z, et al., 2023, Current engineering developments for robotic systems in flexible endoscopy, Techniques and Innovations in Gastrointestinal Endoscopy, Vol: 25, Pages: 67-81, ISSN: 2590-0307
The past four decades have seen an increase in the incidence of early-onset gastrointestinal cancer. Because early-stage cancer detection is vital to reduce mortality rate, mass screening colonoscopy provides the most effective prevention strategy. However, conventional endoscopy is a painful and technically challenging procedure that requires sedation and experienced endoscopists to be performed. To overcome the current limitations, technological innovation is needed in colonoscopy. In recent years, researchers worldwide have worked to enhance the diagnostic and therapeutic capabilities of endoscopes. The new frontier of endoscopic interventions is represented by robotic flexible endoscopy. Among all options, self-propelling soft endoscopes are particularly promising thanks to their dexterity and adaptability to the curvilinear gastrointestinal anatomy. For these devices to replace the standard endoscopes, integration with embedded sensors and advanced surgical navigation technologies must be investigated. In this review, the progress in robotic endoscopy was divided into the fundamental areas of design, sensing, and imaging. The article offers an overview of the most promising advancements on these three topics since 2018. Continuum endoscopes, capsule endoscopes, and add-on endoscopic devices were included, with a focus on fluid-driven, tendon-driven, and magnetic actuation. Sensing methods employed for the shape and force estimation of flexible endoscopes were classified into model- and sensor-based approaches. Finally, some key contributions in molecular imaging technologies, artificial neural networks, and software algorithms are described. Open challenges are discussed to outline a path toward clinical practice for the next generation of endoscopic devices.
Jiang S, Strout Z, He B, et al., 2023, Dual Stream Meta Learning for Road Surface Classification and Riding Event Detection on Shared Bikes, IEEE Transactions on Systems, Man, and Cybernetics: Systems, ISSN: 2168-2216
Road surface condition monitoring and bike riding event detection are crucial in densely populated cities for travel efficiency and rider safety. However, most current approaches are either costly, unreliable in different scenarios, or not adaptable in new environments. This article proposes a novel automated approach leveraging widely used shared bikes to intelligently detect road surface conditions and riding events suitable for interactive Internet of Things (IoT) cities. We propose a novel dual stream meta learning approach to solve the reliability problem when bike types for the training and testing are different with a limited set of new samples and the self-adaptive problem when classifying new classes without retraining the model, both via dual stream meta learning. Results demonstrate the feasibility of the proposed IoT-based solution with 98.9% accuracy for road surface conditions and 99.6% accuracy for riding events via the proposed dual stream deep learning method in the conventional scenario. With few samples per class, the proposed method is more reliable than other commonly used approaches in the different-bike scenario (e.g., proposed 92.4% versus random forest 74.6%). In cases of predicting new classes, the algorithm is 95.6% accurate using only one sample per class without explicit training (compared to 78.0% for K-nearest neighbor). This article proposes a robust IoT framework for smart cities involving road surface conditions and rider events which could be critical for many applications, including city mapping, shared bike rental maintenance and rider performance, and city maintenance services.
Shu Y, Gu X, Yang G-Z, et al., 2022, Revisiting self-supervised contrastive learning for facial expression recognition, British Machine Vision Conference, Publisher: British Machine Vision Association, Pages: 1-14
The success of most advanced facial expression recognition works relies heavily on large-scale annotated datasets. However, it poses great challenges in acquiring clean and consistent annotations for facial expression datasets. On the other hand, self-supervised contrastive learning has gained great popularity due to its simple yet effective instance discrimination training strategy, which can potentially circumvent the annotation issue. Nevertheless, there remain inherent disadvantages of instance-level discrimination, which are even more challenging when faced with complicated facial representations. In this paper, we revisit the use of self-supervised contrastive learning and explore three core strategies to enforce expression-specific representations and to minimize the interference from other facial attributes, such as identity and face styling. Experimental results show that our proposed method outperforms the current state-of-the-art self-supervised learning methods, in terms of both categorical and dimensional facial expression recognition tasks. Our project page: https://claudiashu.github.io/SSLFER.
Lam K, Lo FP-W, An Y, et al., 2022, Deep Learning for Instrument Detection and Assessment of Operative Skill in Surgical Videos, IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, Vol: 4, Pages: 1068-1071
Diao H, Chen C, Liu X, et al., 2022, Real-Time and Cost-Effective Smart Mat System Based on Frequency Channel Selection for Sleep Posture Recognition in IoMT, IEEE INTERNET OF THINGS JOURNAL, Vol: 9, Pages: 21421-21431, ISSN: 2327-4662
Gu X, Guo Y, Li Z, et al., 2022, Tackling long-tailed category distribution under domain shifts, European Conference on Computer Vision (ECCV 2022), Publisher: Springer, Pages: 727-743, ISSN: 0302-9743
Machine learning models fail to perform well on real-world applications when 1) the category distribution P(Y) of the training dataset suffers from long-tailed distribution and 2) the test data is drawn from different conditional distributions P(X|Y). Existing approaches cannot handle the scenario where both issues exist, which however is common for real-world applications. In this study, we took a step forward and looked into the problem of long-tailed classification under domain shifts. We designed three novel core functional blocks including Distribution Calibrated Classification Loss, Visual-Semantic Mapping and Semantic-Similarity Guided Augmentation. Furthermore, we adopted a meta-learning framework which integrates these three blocks to improve domain generalization on unseen target domains. Two new datasets were proposed for this problem, named AWA2-LTS and ImageNet-LTS. We evaluated our method on the two datasets and extensive experimental results demonstrate that our proposed method can achieve superior performance over state-of-the-art long-tailed/domain generalization approaches and the combinations. Source codes and datasets can be found at our project page https://xiaogu.site/LTDS.
Zhang D, Ren Y, Barbot A, et al., 2022, Fabrication and optical manipulation of micro-robots for biomedical applications, MATTER, Vol: 5, Pages: 3135-3160, ISSN: 2590-2393
Qiu J, Chen L, Gu X, et al., 2022, Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion, IEEE ROBOTICS AND AUTOMATION LETTERS, Vol: 7, Pages: 8799-8806, ISSN: 2377-3766
Sun Y, Lo FP-W, Lo B, 2022, Light-weight internet-of-things device authentication, encryption and key distribution using end-to-end neural cryptosystems, IEEE Internet of Things Journal, Vol: 9, Pages: 14978-14987, ISSN: 2327-4662
Device authentication, encryption, and key distribution are of vital importance to any Internet-of-Things (IoT) systems, such as the new smart city infrastructures. This is due to the concern that attackers could easily exploit the lack of strong security in IoT devices to gain unauthorized access to the system or to hijack IoT devices to perform denial-of-service attacks on other networks. With the rise of fog and edge computing in IoT systems, increasing numbers of IoT devices have been equipped with computing capabilities to perform data analysis with deep learning technologies. Deep learning on edge devices can be deployed in numerous applications, such as local cardiac arrhythmia detection on a smart sensing patch, but it is rarely applied to device authentication and wireless communication encryption. In this paper, we propose a novel lightweight IoT device authentication, encryption, and key distribution approach using neural cryptosystems and binary latent space. The neural cryptosystems adopt three types of end-to-end encryption schemes: symmetric, public-key, and without keys. A series of experiments were conducted to test the performance and security strength of the proposed neural cryptosystems. The experimental results demonstrate the potential of this novel approach as a promising security and privacy solution for the next-generation of IoT systems.
Yunxiao R, Keshavarz M, Salzitsa A, et al., 2022, Machine Learning-Based Real-Time Localisation and Automatic Trapping of Multiple Microrobots in Optical Tweezer, International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS2022)
Zhang D, Wu Z, Chen J, et al., 2022, Human-robot shared control for surgical robot based on context-aware sim-to-real adaptation, 2022 IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 7701-7707
Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitate efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some of the surgical subtasks for the construction of the shared control mechanism. However, a sufficient amount of data is required for the robot to learn the manoeuvres. Using a surgical simulator to collect data is a less resource-demanding approach. With sim-to-real adaptation, the manoeuvres learned from a simulator can be transferred to a physical robot. To this end, we propose a sim-to-real adaptation method to construct a human-robot shared control framework for robotic surgery. In this paper, a desired trajectory is generated from a simulator using an LfD method, while dynamic motion primitives (DMP) are used to transfer the desired trajectory from the simulator to the physical robotic platform. Moreover, a role adaptation mechanism is developed such that the robot can adjust its role according to the surgical operation contexts predicted by a neural network model. The effectiveness of the proposed framework is validated on the da Vinci Research Kit (dVRK). Results of the user studies indicated that with the adaptive human-robot shared control framework, the path length of the remote controller, the total clutching number and the task completion time can be reduced significantly. The proposed method outperformed the traditional manual control via teleoperation.
Gil B, Lo B, Yang G-Z, et al., 2022, Smart implanted access port catheter for therapy intervention with pH and lactate biosensors., Materials Today Bio, Vol: 15, Pages: 1-9, ISSN: 2590-0064
Totally implanted access ports (TIAP) are widely used with oncology patients requiring long term central venous access for the delivery of chemotherapeutic agents, infusions, transfusions, blood sample collection and parenteral nutrition. Such devices offer a significant improvement to the quality of life for patients and reduced complication rates, particularly infection, in contrast to the classical central venous catheters. Nevertheless, infections do occur, with biofilm formation bringing difficulties to the treatment of infection-related complications that can ultimately lead to the explantation of the device. A smart TIAP device that is sensor-enabled to detect infection prior to extensive biofilm formation would reduce the cases for potential device explantation, whereas biomarker detection within body fluids such as pH or lactate would provide vital information regarding metabolic processes occurring inside the body. In this paper, we propose a novel batteryless and wireless device suitable for the interrogation of such markers in an embodiment model of a TIAP, with miniature biochemical sensing needles. Device readings can be carried out by a smartphone equipped with a Near Field Communication (NFC) interface at relatively short distances off-body, while providing radiofrequency energy harvesting capability to the TIAP, useful for assessing the patient's health and potential port infection on demand.
Li Y, Peng C, Zhang Y, et al., 2022, Adversarial learning for semi-supervised pediatric sleep staging with single-EEG channel, METHODS, Vol: 204, Pages: 84-91, ISSN: 1046-2023
Cerminaro C, Sazonov E, McCrory MA, et al., 2022, Feasibility of the automatic ingestion monitor (AIM-2) for infant feeding assessment: a pilot study among breast-feeding mothers from Ghana, PUBLIC HEALTH NUTRITION, Vol: 25, Pages: 2897-2907, ISSN: 1368-9800
Bai W, Cursi F, Guo X, et al., 2022, Task-Based LSTM Kinematic Modeling for a Tendon-Driven Flexible Surgical Robot, IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, Vol: 4, Pages: 339-342
Zhang D, Barbot A, Seichepine F, et al., 2022, Micro-object pose estimation with sim-to-real transfer learning using small dataset, Communications Physics, Vol: 5, ISSN: 2399-3650
Gil B, Anastasova S, Lo B, 2022, Graphene field-effect transistors array for detection of liquid conductivities in the physiological range through novel time-multiplexed impedance measurements, CARBON, Vol: 193, Pages: 394-403, ISSN: 0008-6223
Lam K, Chen J, Wang Z, et al., 2022, Machine learning for technical skill assessment in surgery: a systematic review, npj Digital Medicine, Vol: 5, ISSN: 2398-6352
Accurate and objective performance assessment is essential for both trainees and certified surgeons. However, existing methods can be time-consuming, labor-intensive and subject to bias. Machine learning (ML) has the potential to provide rapid, automated and reproducible feedback without the need for expert reviewers. We aimed to systematically review the literature and determine the ML techniques used for technical surgical skill assessment and identify challenges and barriers in the field. A systematic literature search, in accordance with the PRISMA statement, was performed to identify studies detailing the use of ML for technical skill assessment in surgery. Of the 1896 studies that were retrieved, 66 studies were included. The most common ML methods used were Hidden Markov Models (HMM, 14/66), Support Vector Machines (SVM, 17/66) and Artificial Neural Networks (ANN, 17/66). 40/66 studies used kinematic data, 19/66 used video or image data, and 7/66 used both. Studies assessed performance of benchtop tasks (48/66), simulator tasks (10/66), and real-life surgery (8/66). Accuracy rates of over 80% were achieved, although tasks and participants varied between studies. Barriers to progress in the field included a focus on basic tasks, lack of standardization between studies, and lack of datasets. ML has the potential to produce accurate and objective surgical skill assessment through the use of methods including HMM, SVM, and ANN. Future ML-based assessment tools should move beyond the assessment of basic tasks and towards real-life surgery and provide interpretable feedback with clinical value for the surgeon.
Gu X, Guo Y, Yang G-Z, et al., 2022, Cross-domain self-supervised complete geometric representation learning for real-scanned point cloud based pathological gait analysis, IEEE Journal of Biomedical and Health Informatics, Vol: 26, Pages: 1034-1044, ISSN: 2168-2194
Accurate lower-limb pose estimation is a prerequisite of skeleton based pathological gait analysis. To achieve this goal in free-living environments for long-term monitoring, a single depth sensor has been proposed in research. However, the depth map acquired from a single viewpoint encodes only partial geometric information of the lower limbs and exhibits large variations across different viewpoints. Existing off-the-shelf three-dimensional (3D) pose tracking algorithms and public datasets for depth based human pose estimation are mainly targeted at activity recognition applications. They are relatively insensitive to skeleton estimation accuracy, especially at the foot segments. Furthermore, acquiring ground truth skeleton data for detailed biomechanics analysis also requires considerable effort. To address these issues, we propose a novel cross-domain self-supervised complete geometric representation learning framework, with knowledge transfer from the unlabelled synthetic point clouds of full lower-limb surfaces. The proposed method can significantly reduce the number of ground truth skeletons (with only 1%) in the training phase, meanwhile ensuring accurate and precise pose estimation and capturing discriminative features across different pathological gait patterns compared to other methods.
Jia W, Ren Y, Li B, et al., 2022, A Novel Approach to Dining Bowl Reconstruction for Image-Based Food Volume Estimation, SENSORS, Vol: 22