Imperial College London

Dr Benny Lo

Faculty of Medicine, Department of Surgery & Cancer

Reader
 
 
 

Contact

 

+44 (0)20 7594 0806, benny.lo

 
 

Location

 

B414B, Bessemer Building, South Kensington Campus



 

Publications


265 results found

Shu Y, Gu X, Yang G-Z, Lo B et al., 2022, Revisiting self-supervised contrastive learning for facial expression recognition, British Machine Vision Conference, Publisher: British Machine Vision Association, Pages: 1-14

The success of most advanced facial expression recognition works relies heavily on large-scale annotated datasets. However, acquiring clean and consistent annotations for facial expression datasets poses great challenges. On the other hand, self-supervised contrastive learning has gained great popularity due to its simple yet effective instance discrimination training strategy, which can potentially circumvent the annotation issue. Nevertheless, there remain inherent disadvantages of instance-level discrimination, which are even more challenging when faced with complicated facial representations. In this paper, we revisit the use of self-supervised contrastive learning and explore three core strategies to enforce expression-specific representations and to minimize the interference from other facial attributes, such as identity and face styling. Experimental results show that our proposed method outperforms the current state-of-the-art self-supervised learning methods on both categorical and dimensional facial expression recognition tasks. Our project page: https://claudiashu.github.io/SSLFER.
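The instance-discrimination training strategy mentioned above is typically implemented with a contrastive loss such as the SimCLR-style NT-Xent objective, which pulls embeddings of two augmented views of the same image together and pushes all other embeddings in the batch apart. A minimal NumPy sketch of that general idea (an illustration only, not the paper's exact formulation; array shapes and the temperature value are assumptions):

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent loss over a batch of paired embeddings.

    z1, z2: (N, D) arrays of L2-normalised embeddings of two augmented
    views of the same N images; row i of z1 is the positive of row i of z2.
    """
    z = np.concatenate([z1, z2], axis=0)            # (2N, D)
    sim = z @ z.T / temperature                     # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                  # exclude self-similarity
    n = z1.shape[0]
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # positive index per row
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16)); z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = z1 + 0.05 * rng.normal(size=(8, 16)); z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
z3 = rng.normal(size=(8, 16)); z3 /= np.linalg.norm(z3, axis=1, keepdims=True)
loss_aligned = nt_xent_loss(z1, z2)   # views of the same images
loss_random = nt_xent_loss(z1, z3)    # unrelated embeddings
print(loss_aligned < loss_random)     # aligned views score a lower loss
```

The paper's contribution is precisely in moving beyond this vanilla instance-level objective, since identity and styling cues dominate such embeddings for faces.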

Conference paper

Zhang C, Jovanov E, Liao H, Zhang Y-T, Lo B, Zhang Y, Guan C et al., 2022, Video Based Cocktail Causal Container for Blood Pressure Classification and Blood Glucose Prediction, IEEE J Biomed Health Inform, Vol: PP

With the development of modern cameras, more physiological signals can be obtained from portable devices such as smartphones. Some hemodynamically based non-invasive video processing applications have been applied to blood pressure classification and blood glucose prediction for unobtrusive physiological monitoring at home. However, this approach is still under development, with very few publications. In this paper, we propose an end-to-end framework, entitled cocktail causal container, to fuse multiple physiological representations and to reconstruct the correlation between frequency and temporal information during multi-task learning. The cocktail causal container processes hematologic reflex information to classify blood pressure and predict blood glucose. Since learning discriminative features from video physiological representations is quite challenging, we propose a token feature fusion block to fuse the multi-view fine-grained representations into a union discrete frequency space. A causal net is used to analyze the fused higher-order information, so that the framework can be enforced to disentangle the latent factors into the related endogenous associations that correspond to downstream fusion information, improving the semantic interpretation. Moreover, a pair-wise temporal frequency map is developed to provide valuable insights into the extraction of salient photoplethysmography (PPG) information from fingertip videos obtained by a standard smartphone camera. Extensive comparisons have been implemented for the validation of the cocktail causal container using a clinical dataset and the PPG-BP benchmark. A root mean square error of 1.329±0.167 for blood glucose prediction and a precision of 0.89±0.03 for blood pressure classification were achieved on the clinical dataset.

Journal article

Diao H, Chen C, Liu X, Yuan W, Amara A, Tamura T, Lo B, Fan J, Meng L, Pun SH, Zhang Y-T, Chen W et al., 2022, Real-Time and Cost-Effective Smart Mat System Based on Frequency Channel Selection for Sleep Posture Recognition in IoMT, IEEE Internet of Things Journal, Vol: 9, Pages: 21421-21431, ISSN: 2327-4662

Journal article

Gu X, Guo Y, Li Z, Qiu J, Dou Q, Liu Y, Lo B, Yang G-Z et al., 2022, Tackling long-tailed category distribution under domain shifts, European Conference on Computer Vision (ECCV 2022), Publisher: Springer, ISSN: 0302-9743

Machine learning models fail to perform well on real-world applications when 1) the category distribution P(Y) of the training dataset suffers from long-tailed distribution and 2) the test data is drawn from different conditional distributions P(X|Y). Existing approaches cannot handle the scenario where both issues exist, which however is common for real-world applications. In this study, we took a step forward and looked into the problem of long-tailed classification under domain shifts. We designed three novel core functional blocks including Distribution Calibrated Classification Loss, Visual-Semantic Mapping and Semantic-Similarity Guided Augmentation. Furthermore, we adopted a meta-learning framework which integrates these three blocks to improve domain generalization on unseen target domains. Two new datasets were proposed for this problem, named AWA2-LTS and ImageNet-LTS. We evaluated our method on the two datasets and extensive experimental results demonstrate that our proposed method can achieve superior performance over state-of-the-art long-tailed/domain generalization approaches and their combinations. Source codes and datasets can be found at our project page https://xiaogu.site/LTDS.

Conference paper

Zhang D, Ren Y, Barbot A, Seichepine F, Lo B, Ma Z-C, Yang G-Z et al., 2022, Fabrication and optical manipulation of micro-robots for biomedical applications, Matter, Vol: 5, Pages: 3135-3160, ISSN: 2590-2393

Journal article

Qiu J, Chen L, Gu X, Lo FP-W, Tsai Y-Y, Sun J, Liu J, Lo B et al., 2022, Egocentric Human Trajectory Forecasting With a Wearable Camera and Multi-Modal Fusion, IEEE Robotics and Automation Letters, Vol: 7, Pages: 8799-8806, ISSN: 2377-3766

Journal article

Sun Y, Lo FP-W, Lo B, 2022, Light-weight internet-of-things device authentication, encryption and key distribution using end-to-end neural cryptosystems, IEEE Internet of Things Journal, Vol: 9, Pages: 14978-14987, ISSN: 2327-4662

Device authentication, encryption, and key distribution are of vital importance to any Internet-of-Things (IoT) systems, such as the new smart city infrastructures. This is due to the concern that attackers could easily exploit the lack of strong security in IoT devices to gain unauthorized access to the system or to hijack IoT devices to perform denial-of-service attacks on other networks. With the rise of fog and edge computing in IoT systems, increasing numbers of IoT devices have been equipped with computing capabilities to perform data analysis with deep learning technologies. Deep learning on edge devices can be deployed in numerous applications, such as local cardiac arrhythmia detection on a smart sensing patch, but it is rarely applied to device authentication and wireless communication encryption. In this paper, we propose a novel lightweight IoT device authentication, encryption, and key distribution approach using neural cryptosystems and binary latent space. The neural cryptosystems adopt three types of end-to-end encryption schemes: symmetric, public-key, and without keys. A series of experiments were conducted to test the performance and security strength of the proposed neural cryptosystems. The experimental results demonstrate the potential of this novel approach as a promising security and privacy solution for the next generation of IoT systems.

Journal article

Li Y, Peng C, Zhang Y, Zhang Y, Lo B et al., 2022, Adversarial learning for semi-supervised pediatric sleep staging with single-EEG channel, Methods, Vol: 204, Pages: 84-91, ISSN: 1046-2023

Journal article

Zhang D, Wu Z, Chen J, Zhu R, Munawar A, Xiao B, Guan Y, Su H, Hong W, Guo Y, Fischer GS, Lo B, Yang G-Z et al., 2022, Human-robot shared control for surgical robot based on context-aware sim-to-real adaptation, 2022 IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 7701-7707

Human-robot shared control, which integrates the advantages of both humans and robots, is an effective approach to facilitate efficient surgical operation. Learning from demonstration (LfD) techniques can be used to automate some of the surgical sub-tasks for the construction of the shared control mechanism. However, a sufficient amount of data is required for the robot to learn the manoeuvres. Using a surgical simulator to collect data is a less resource-demanding approach. With sim-to-real adaptation, the manoeuvres learned from a simulator can be transferred to a physical robot. To this end, we propose a sim-to-real adaptation method to construct a human-robot shared control framework for robotic surgery. In this paper, a desired trajectory is generated from a simulator using an LfD method, while dynamic motion primitives (DMPs) are used to transfer the desired trajectory from the simulator to the physical robotic platform. Moreover, a role adaptation mechanism is developed such that the robot can adjust its role according to the surgical operation contexts predicted by a neural network model. The effectiveness of the proposed framework is validated on the da Vinci Research Kit (dVRK). Results of the user studies indicated that with the adaptive human-robot shared control framework, the path length of the remote controller, the total clutching number and the task completion time can be reduced significantly. The proposed method outperformed traditional manual control via teleoperation.
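Dynamic motion primitives encode a demonstrated trajectory as a spring-damper system plus a learned forcing term, so the demonstrated shape can be replayed toward a new goal, which is what makes them suitable for transferring simulator trajectories to a physical robot. Below is a generic 1-D discrete DMP sketch (the gains, basis-function heuristics, and lack of goal scaling are simplifying assumptions, not the paper's implementation):

```python
import numpy as np

def dmp_fit_and_rollout(y_demo, dt, new_goal, n_basis=20,
                        alpha=25.0, beta=6.25, alpha_x=3.0):
    """Minimal 1-D discrete Dynamic Movement Primitive: fit a forcing
    term to a demonstration with locally weighted regression, then roll
    the primitive out toward a (possibly different) goal."""
    T = len(y_demo)
    y0, g = y_demo[0], y_demo[-1]
    yd = np.gradient(y_demo, dt)
    ydd = np.gradient(yd, dt)
    x = np.exp(-alpha_x * dt * np.arange(T))              # canonical phase, 1 -> ~0
    f_target = ydd - alpha * (beta * (g - y_demo) - yd)   # forcing term of the demo
    c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))     # basis centres in phase space
    h = n_basis / c                                       # heuristic widths
    psi = np.exp(-h * (x[:, None] - c) ** 2)              # (T, n_basis) activations
    # per-basis locally weighted regression: f(x) ~ x * (psi @ w) / sum(psi)
    w = (psi * (x * f_target)[:, None]).sum(0) / (psi * (x ** 2)[:, None]).sum(0)
    y, v, out = y0, 0.0, []
    for t in range(T):
        f = x[t] * (psi[t] @ w) / psi[t].sum()
        v += (alpha * (beta * (new_goal - y) - v) + f) * dt   # spring-damper + forcing
        y += v * dt
        out.append(y)
    return np.array(out)

t = np.linspace(0, 1, 200)
demo = np.sin(np.pi * t / 2)                  # smooth reach from 0 to 1
traj = dmp_fit_and_rollout(demo, t[1] - t[0], new_goal=2.0)
print(abs(traj[-1] - 2.0) < 0.1)              # rollout converges near the new goal
```

The spring-damper term guarantees goal convergence while the phase-gated forcing term reproduces the demonstrated shape, which is why DMPs generalise a single demonstration to new start/goal configurations.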

Conference paper

Gil B, Lo B, Yang G-Z, Anastasova S et al., 2022, Smart implanted access port catheter for therapy intervention with pH and lactate biosensors, Materials Today Bio, Vol: 15, Pages: 1-9, ISSN: 2590-0064

Totally implanted access ports (TIAPs) are widely used with oncology patients requiring long-term central venous access for the delivery of chemotherapeutic agents, infusions, transfusions, blood sample collection and parenteral nutrition. Such devices offer a significant improvement to the quality of life for patients and reduced complication rates, particularly infection, in contrast to classical central venous catheters. Nevertheless, infections do occur, with biofilm formation bringing difficulties to the treatment of infection-related complications that can ultimately lead to the explantation of the device. A smart TIAP that is sensor-enabled to detect infection prior to extensive biofilm formation would reduce the cases of potential device explantation, whereas detection of biomarkers within body fluids, such as pH or lactate, would provide vital information regarding metabolic processes occurring inside the body. In this paper, we propose a novel batteryless and wireless device suitable for the interrogation of such markers in an embodiment model of a TIAP, with miniature biochemical sensing needles. Device readings can be carried out by a smartphone equipped with a Near Field Communication (NFC) interface at relatively short distances off-body, while providing radiofrequency energy harvesting capability to the TIAP, useful for assessing the patient's health and potential port infection on demand.

Journal article

Cerminaro C, Sazonov E, McCrory MA, Steiner-Asiedu M, Bhaskar V, Gallo S, Laing E, Jia W, Sun M, Baranowski T, Frost G, Lo B, Anderson AK et al., 2022, Feasibility of the automatic ingestion monitor (AIM-2) for infant feeding assessment: a pilot study among breast-feeding mothers from Ghana, Public Health Nutrition, Vol: 25, Pages: 2897-2907, ISSN: 1368-9800

Journal article

Bai W, Cursi F, Guo X, Huang B, Lo B, Yang GZ, Yeatman EM et al., 2022, Task-Based LSTM Kinematic Modeling for a Tendon-Driven Flexible Surgical Robot, IEEE Transactions on Medical Robotics and Bionics, Vol: 4, Pages: 339-342

Tendon-driven flexible surgical robots normally suffer from inaccurate modeling and imprecise motion control due to the nonlinearities of tendon transmission. Learning-based approaches are driven by experimental data, with uncertainties modeled empirically, and can be adopted to mitigate these issues. This work proposes an LSTM-based kinematic modeling approach with task-based data for a flexible tendon-driven surgical robot to improve control accuracy. Real experiments demonstrated the effectiveness and superiority of the learned model in completing path-following tasks, especially compared to traditional modeling.

Journal article

Zhang D, Barbot A, Seichepine F, Lo FP-W, Bai W, Yang G-Z, Lo B et al., 2022, Micro-object pose estimation with sim-to-real transfer learning using small dataset, Communications Physics, Vol: 5, ISSN: 2399-3650

Journal article

Lam K, Chen J, Wang Z, Iqbal F, Darzi A, Lo B, Purkayastha S, Kinross J et al., 2022, Machine learning for technical skill assessment in surgery: a systematic review, npj Digital Medicine, Vol: 5, ISSN: 2398-6352

Accurate and objective performance assessment is essential for both trainees and certified surgeons. However, existing methods can be time-consuming, labor-intensive and subject to bias. Machine learning (ML) has the potential to provide rapid, automated and reproducible feedback without the need for expert reviewers. We aimed to systematically review the literature and determine the ML techniques used for technical surgical skill assessment and identify challenges and barriers in the field. A systematic literature search, in accordance with the PRISMA statement, was performed to identify studies detailing the use of ML for technical skill assessment in surgery. Of the 1896 studies that were retrieved, 66 studies were included. The most common ML methods used were Hidden Markov Models (HMM, 14/66), Support Vector Machines (SVM, 17/66) and Artificial Neural Networks (ANN, 17/66). 40/66 studies used kinematic data, 19/66 used video or image data, and 7/66 used both. Studies assessed performance on benchtop tasks (48/66), simulator tasks (10/66), and real-life surgery (8/66). Accuracy rates of over 80% were achieved, although tasks and participants varied between studies. Barriers to progress in the field included a focus on basic tasks, lack of standardization between studies, and lack of datasets. ML has the potential to produce accurate and objective surgical skill assessment through the use of methods including HMM, SVM, and ANN. Future ML-based assessment tools should move beyond the assessment of basic tasks and towards real-life surgery, and provide interpretable feedback with clinical value for the surgeon.

Journal article

Gu X, Guo Y, Yang G-Z, Lo B et al., 2022, Cross-domain self-supervised complete geometric representation learning for real-scanned point cloud based pathological gait analysis, IEEE Journal of Biomedical and Health Informatics, Vol: 26, Pages: 1034-1044, ISSN: 2168-2194

Accurate lower-limb pose estimation is a prerequisite of skeleton-based pathological gait analysis. To achieve this goal in free-living environments for long-term monitoring, a single depth sensor has been proposed in research. However, the depth map acquired from a single viewpoint encodes only partial geometric information of the lower limbs and exhibits large variations across different viewpoints. Existing off-the-shelf three-dimensional (3D) pose tracking algorithms and public datasets for depth-based human pose estimation are mainly targeted at activity recognition applications. They are relatively insensitive to skeleton estimation accuracy, especially at the foot segments. Furthermore, acquiring ground truth skeleton data for detailed biomechanics analysis also requires considerable effort. To address these issues, we propose a novel cross-domain self-supervised complete geometric representation learning framework, with knowledge transfer from the unlabelled synthetic point clouds of full lower-limb surfaces. The proposed method can significantly reduce the number of ground truth skeletons (with only 1%) in the training phase, meanwhile ensuring accurate and precise pose estimation and capturing discriminative features across different pathological gait patterns compared to other methods.

Journal article

Jia W, Ren Y, Li B, Beatrice B, Que J, Cao S, Wu Z, Mao Z-H, Lo B, Anderson AK, Frost G, McCrory MA, Sazonov E, Steiner-Asiedu M, Baranowski T, Burke LE, Sun M et al., 2022, A Novel Approach to Dining Bowl Reconstruction for Image-Based Food Volume Estimation, Sensors, Vol: 22

Journal article

Lam K, Lo FPW, An Y, Darzi A, Kinross JM, Purkayastha S, Lo B et al., 2022, Deep learning for instrument detection and assessment of operative skill in surgical videos, IEEE Transactions on Medical Robotics and Bionics

Surgical performance has been shown to be directly related to patient outcomes. There is significant variation in surgical performance and therefore a need to measure operative skill accurately and reliably. Despite this, current means of surgical performance assessment rely on expert observation which is labor-intensive, prone to rater bias and unreliable. We present an automatic approach to surgical performance assessment through the tracking of instruments in endoscopic video. We annotate the spatial bounds of surgical instruments in 2600 images and use this new dataset to train Mask R-CNN, a state-of-the-art instance segmentation framework. We show that we can successfully achieve spatial detection of surgical instruments by generating a pixel-by-pixel mask over the detected instrument and achieving an overall mAP of 0.839 for an IoU of 0.5. We leverage the results from our instrument detection framework to assess surgical performance through the generation of instrument trajectory maps and instrument metrics such as moving distance, smoothness of instrument movement and concentration of instrument movement.
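Once per-frame instrument positions are extracted from the segmentation masks, metrics such as moving distance and smoothness reduce to simple geometry over the centroid trajectory. A minimal sketch with illustrative metric definitions (assumed for the example, not the paper's exact formulas; the 25 fps frame interval is also an assumption):

```python
import numpy as np

def trajectory_metrics(centroids, dt=0.04):
    """Motion metrics from per-frame instrument centroids, shape (N, 2).

    Illustrative definitions: path length = summed step distances;
    smoothness = negated mean squared jerk (higher is smoother);
    concentration = RMS distance from the mean position (lower = more
    concentrated movement)."""
    c = np.asarray(centroids, dtype=float)
    path_length = np.linalg.norm(np.diff(c, axis=0), axis=1).sum()
    jerk = np.diff(c, n=3, axis=0) / dt ** 3        # third finite difference of position
    smoothness = -np.mean((jerk ** 2).sum(axis=1))
    concentration = np.sqrt(np.mean(((c - c.mean(axis=0)) ** 2).sum(axis=1)))
    return path_length, smoothness, concentration

t = np.linspace(0, 1, 100)
smooth = np.stack([t, t ** 2], axis=1)                      # deliberate, direct motion
rng = np.random.default_rng(1)
jittery = smooth + 0.02 * rng.normal(size=smooth.shape)     # same path with tremor
L_s, S_s, _ = trajectory_metrics(smooth)
L_j, S_j, _ = trajectory_metrics(jittery)
print(L_j > L_s, S_j < S_s)   # tremor lengthens the path and lowers smoothness
```

Such summary metrics are what let frame-level detections feed into an interpretable skill score.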

Journal article

Lo FPW, Guo Y, Sun Y, Qiu J, Lo B et al., 2022, An Intelligent Vision-Based Nutritional Assessment Method for Handheld Food Items, IEEE Transactions on Multimedia, ISSN: 1520-9210

Dietary assessment has proven to be an important tool to evaluate the dietary intake of patients with diabetes and obesity. The traditional approach to assessing dietary intake is to conduct a 24-hour dietary recall (24HR), a structured interview designed to obtain information on food categories and volume consumed by the participants. Due to the unconscious biases in this kind of self-reporting approach, many research studies have explored the use of vision-based approaches to provide accurate and objective assessments. Despite the promising results of food recognition by deep neural networks, there still exist several hurdles in deep learning-based food volume estimation, ranging from the domain shift between synthetic and raw 3D models and shape completion ambiguity to, most importantly, the lack of large-scale paired training datasets. Therefore, this paper proposes an intelligent nutritional assessment approach via weakly-supervised point cloud completion. It aims to close the reality gap in 3D point cloud completion tasks and address the targeted challenges by proposing the Point Cloud Auxiliary Classifier GAN (PC-ACGAN), a novel and robust point cloud completion architecture that can be trained in a weakly-supervised manner. The volume can then be easily estimated from the completed representation of the food. In addition to its efficiency, a major merit of our system is that it can estimate the volume of handheld food items without constraints such as placing the food items on a table or next to fiducial markers, which facilitates implementation on both wearable and handheld cameras. Comprehensive experiments have been carried out on major benchmark datasets and a self-constructed volume-annotated dataset, in which the proposed PC-ACGAN demonstrates comparable results with several strong fully-supervised baseline methods and shows superior completion ability in handling food volume estimation.
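After completion, the volume step amounts to measuring the space enclosed by the completed point cloud. A crude voxel-counting sketch of that step (a stand-in illustration under the assumption of a solid object with no overhangs along z; the paper's actual pipeline may mesh or integrate the cloud differently):

```python
import numpy as np

def enclosed_volume(points, voxel=0.005):
    """Volume estimate for a completed (closed-surface) point cloud:
    voxelise the surface points, then fill each vertical column between
    its lowest and highest occupied voxel and count the filled voxels."""
    ijk = np.floor(np.asarray(points) / voxel).astype(int)
    cols = {}
    for i, j, k in ijk:
        lo, hi = cols.get((i, j), (k, k))
        cols[(i, j)] = (min(lo, k), max(hi, k))
    filled = sum(hi - lo + 1 for lo, hi in cols.values())
    return filled * voxel ** 3

# sanity check against a sphere of radius 5 cm (true volume = 4/3 * pi * r^3)
rng = np.random.default_rng(0)
p = rng.normal(size=(20000, 3))
p = 0.05 * p / np.linalg.norm(p, axis=1, keepdims=True)   # dense surface sample
est = enclosed_volume(p)
true = 4 / 3 * np.pi * 0.05 ** 3
print(abs(est - true) / true < 0.25)   # within discretisation error of the truth
```

This is why completion matters: a partial cloud seen from one camera viewpoint has no closed surface, so no such volume integral is well defined until the hidden side is reconstructed.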

Journal article

Li Y, Luo S, Zhang H, Zhang Y, Zhang Y, Lo B et al., 2022, MtCLSS: Multi-Task Contrastive Learning for Semi-Supervised Pediatric Sleep Staging, IEEE Journal of Biomedical and Health Informatics, Pages: 1-9, ISSN: 2168-2194

Journal article

Qiu J, Lo FP-W, Gu X, Sun Y, Jiang S, Lo B et al., 2021, Indoor future person localization from an egocentric wearable camera, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 8586-8592

Accurate prediction of future person location and movement trajectory from an egocentric wearable camera can benefit a wide range of applications, such as assisting visually impaired people in navigation and the development of mobility assistance for people with disability. In this work, a new egocentric dataset was constructed using a wearable camera, with 8,250 short clips of a targeted person either walking 1) toward, 2) away, or 3) across the camera wearer in indoor environments, or 4) staying still in the scene, and 13,817 person bounding boxes were manually labelled. Apart from the bounding boxes, the dataset also contains the estimated pose of the targeted person as well as the IMU signal of the wearable camera at each time point. An LSTM-based encoder-decoder framework was designed to predict the future location and movement trajectory of the targeted person in this egocentric setting. Extensive experiments on the new dataset have shown that the proposed method predicts future person location and trajectory in egocentric videos more reliably than three baseline methods.

Conference paper

Zhang D, Wang R, Lo B, 2021, Surgical gesture recognition based on bidirectional multi-layer independently RNN with explainable spatial feature extraction, IEEE International Conference on Robotics and Automation (ICRA) 2021, Publisher: IEEE, Pages: 1350-1356

Minimally invasive surgery mainly consists of a series of sub-tasks, which can be decomposed into basic gestures or contexts. As a prerequisite of autonomous operation, surgical gesture recognition can assist motion planning and decision-making, and build up context-aware knowledge to improve the control quality of surgical robots. In this work, we aim to develop an effective surgical gesture recognition approach with an explainable feature extraction process. A Bidirectional Multi-Layer independently RNN (BML-indRNN) model is proposed in this paper, while spatial feature extraction is implemented via fine-tuning of a Deep Convolutional Neural Network (DCNN) model constructed based on the VGG architecture. To eliminate the black-box effects of DCNN, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed. It can provide explainable results by showing the regions of the surgical images that have a strong relationship with the surgical gesture classification results. The proposed method was evaluated on the suturing task with data obtained from the publicly available JIGSAWS database. Comparative studies were conducted to verify the proposed framework. Results indicated that the testing accuracy for the suturing task based on our proposed method is 87.13%, which outperforms most of the state-of-the-art algorithms.

Conference paper

Han J, Gu X, Lo B, 2021, Semi-supervised contrastive learning for generalizable motor imagery eeg classification, 17th IEEE International Conference on Wearable and Implantable Body Sensor Networks, Publisher: IEEE

Electroencephalography (EEG) is one of the most widely used brain-activity recording methods in non-invasive brain-computer interfaces (BCIs). However, EEG data is highly nonlinear, and its datasets often suffer from issues such as data heterogeneity, label uncertainty and data/label scarcity. To address these, we propose a domain-independent, end-to-end semi-supervised learning framework with contrastive learning and adversarial training strategies. Our method was evaluated in experiments with different amounts of labels and an ablation study on a motor imagery EEG dataset. The experiments demonstrate that the proposed framework, with two different backbone deep neural networks, shows improved performance over its supervised counterparts under the same conditions.

Conference paper

Wang Y, Lo B, 2021, A soft inflatable elbow-assistive robot for children with cerebral palsy

Cerebral palsy can severely impair children's motor function and lead to permanent disability. Compared to adults, children are more vulnerable and susceptible to external harm. Wearable robotics has gained much attention in rehabilitation and has shown its potential in supporting the recovery of people with motor dysfunctions. Conventional adult-oriented wearable assistive robots are tendon-driven, and the force and inertia generated are too large for children, which could cause injury. To address this issue, this paper proposes a novel soft inflatable robot that can aid children in elbow movement whilst minimising the risk of harm. Thermoplastic polyurethane (TPU) and pneumatic actuation were used in developing the soft robot. In experiments, the maximum bending angle was 142.2°, with a maximum generated moment of 0.784 Nm, which is suitable for the elbow support needed by young children with cerebral palsy.

Conference paper

Pavlidou A, Lo B, 2021, Artificial ear - A wearable device for the hearing impaired

Hearing aid devices have been around for decades, and one of the most recent approaches is the cochlear implant, which is designed for patients with severe hearing loss. This paper introduces the design of a haptic-signal based hearing aid targeting patients suffering from inner ear malfunction, for whom conventional assistive hearing devices will not suffice. The device is designed to record incoming sound, filter and analyze it into its harmonics, and classify it into phonemes. The output is transferred into tactile feedback with vibrating motors, with each phoneme activating a respective combination of them.

Conference paper

Rosa BG, Anastasova S, Lo B, 2021, Small-form wearable device for long-term monitoring of cardiac sounds on the body surface

Sound monitoring from sources inside the human body can have important diagnostic relevance in medicine. Cardiac sounds, which originate from the pumping activity of the heart, are one such example, with valuable cardiovascular parameters extracted from the signal, including heart rate (HR) and the systolic intervals. Novel non-invasive methods for early detection of potential life-threatening risks accompanied by unbalanced cardiovascular parameters are essential to reduce the mortality rates associated with cardiac diseases. In this paper, we propose a small-form wearable device for long-term monitoring of cardiac sounds through a miniaturized microphone in contact with the body surface at specific locations, extending from the chest region to the upper and lower body parts. Powered by a battery, the device can measure signals for a consecutive period of 28 h in continuous recording mode, extendable up to 7 days in discontinuous mode, achieving a signal amplitude resolution of 0.81 μV and an optimal bandwidth between 5 and 20 Hz (infrasound range). The proposed device was able to detect cardiac sound patterns in locations as distant as the forehead, wrist, or ankle, thus paving the way to the use of acoustic signals for wearable heartbeat estimators that currently rely on optical or bio-potential methods, while replacing the obtrusive and expensive cardiography equipment dedicated to estimating the systolic intervals directly from the chest.
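Heart rate can be recovered from such band-limited recordings by finding the dominant periodicity of the signal envelope. A minimal sketch on synthetic data (an illustrative method and synthetic test signal, not the device's actual processing; sampling rate and window sizes are assumptions):

```python
import numpy as np

def heart_rate_bpm(sig, fs):
    """Estimate heart rate from a band-limited cardiac sound recording:
    smooth the rectified signal into an envelope, then pick the dominant
    periodicity in a 30-200 bpm window via autocorrelation."""
    win = int(0.05 * fs)                                   # 50 ms smoothing window
    env = np.convolve(np.abs(sig), np.ones(win) / win, mode="same")
    env = env - env.mean()
    ac = np.correlate(env, env, mode="full")[env.size - 1:]
    lo, hi = int(fs * 60 / 200), int(fs * 60 / 30)         # lag range for 200..30 bpm
    lag = lo + np.argmax(ac[lo:hi])
    return 60.0 * fs / lag

# synthetic recording: 80 ms bursts of a 15 Hz tone (inside the device's
# 5-20 Hz band) repeating at 72 bpm, plus measurement noise
fs = 500
t = np.arange(0, 10, 1 / fs)
bursts = (t % (60 / 72) < 0.08).astype(float)
sig = bursts * np.sin(2 * np.pi * 15 * t)
sig = sig + 0.1 * np.random.default_rng(0).normal(size=t.size)
hr = heart_rate_bpm(sig, fs)
print(round(hr))   # recovers a rate close to the 72 bpm ground truth
```

Envelope-based periodicity estimation is robust to the exact burst waveform, which matters when the same device is placed at sites as different as the chest and the ankle.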

Conference paper

Hu M, Kassanos P, Keshavarz M, Yeatman E, Lo B et al., 2021, Electrical and Mechanical Characterization of Carbon-Based Elastomeric Composites for Printed Sensors and Electronics

Printing technologies have attracted significant interest in recent years, particularly for the development of flexible and stretchable electronics and sensors. Conductive elastomeric composites are a popular choice for these new generations of devices. This paper examines the electrical and mechanical properties of elastomeric composites of polydimethylsiloxane (PDMS), an insulating elastomer, with carbon-based fillers (graphite powder and various types of carbon black, CB), as a function of their composition. The results can direct the choice of material composition to address specific device and application requirements. Molding and stencil printing are used to demonstrate their use.

Conference paper

Yang X, Zhang Y, Lo B, Wu D, Liao H, Zhang Y-T et al., 2021, DBAN: Adversarial Network With Multi-Scale Features for Cardiac MRI Segmentation, IEEE Journal of Biomedical and Health Informatics, Vol: 25, Pages: 2018-2028, ISSN: 2168-2194

Journal article

Jiang S, Kang P, Song X, Lo B, Shull PB et al., 2021, Emerging wearable interfaces and algorithms for hand gesture recognition: a survey, IEEE Reviews in Biomedical Engineering, Vol: PP, ISSN: 1941-1189

Hands are vital in a wide range of fundamental daily activities, and neurological diseases that impede hand function can significantly affect quality of life. Wearable hand gesture interfaces hold promise to restore and assist hand function and to enhance human-human and human-computer communication. The purpose of this review is to synthesize current novel sensing interfaces and algorithms for hand gesture recognition, and the scope of applications covers rehabilitation, prosthesis control, sign language recognition, and human-computer interaction. Results showed that electrical, dynamic, acoustical/vibratory, and optical sensing were the primary input modalities in gesture recognition interfaces. Two categories of algorithms were identified: 1) classification algorithms for predefined, fixed hand poses and 2) regression algorithms for continuous finger and wrist joint angles. Conventional machine learning algorithms, including linear discriminant analysis, support vector machines, random forests, and non-negative matrix factorization, have been widely used for a variety of gesture recognition applications, and deep learning algorithms have more recently been applied to further facilitate the complex relationship between sensor signals and multi-articulated hand postures. Future research should focus on increasing recognition accuracy with larger hand gesture datasets, improving reliability and robustness for daily use outside of the laboratory, and developing softer, less obtrusive interfaces.

Journal article

Qiu J, Lo FP-W, Jiang S, Tsai Y-Y, Sun Y, Lo B et al., 2021, Counting bites and recognizing consumed food from videos for passive dietary monitoring, IEEE Journal of Biomedical and Health Informatics, Vol: 25, Pages: 1471-1482, ISSN: 2168-2194

Assessing dietary intake in epidemiological studies is predominantly based on self-reports, which are subjective, inefficient, and also prone to error. Technological approaches are therefore emerging to provide objective dietary assessments. Using only egocentric dietary intake videos, this work aims to provide accurate estimation of individual dietary intake through recognizing consumed food items and counting the number of bites taken. This is different from previous studies that rely on inertial sensing to count bites, and also from previous studies that only recognize visible food items but not consumed ones. As a subject may not consume all food items visible in a meal, recognizing the consumed food items is more valuable. A new dataset of 1,022 dietary intake video clips was constructed to validate our concept of bite counting and consumed food item recognition from egocentric videos. 12 subjects participated and 52 meals were captured. A total of 66 unique food items, including food ingredients and drinks, were labelled in the dataset along with a total of 2,039 labelled bites. Deep neural networks were used to perform bite counting and food item recognition in an end-to-end manner. Experiments have shown that counting bites directly from video clips can reach 74.15% top-1 accuracy (classifying between 0-4 bites in 20-second clips), and an MSE value of 0.312 (when using regression). Our experiments on video-based food recognition also show that recognizing consumed food items is indeed harder than recognizing visible ones, with a drop of 25% in F1 score.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
