Imperial College London

Professor Andrew Davison

Faculty of Engineering, Department of Computing

Professor of Robot Vision

Contact

 

+44 (0)20 7594 8316
a.davison
Website

Assistant

 

Mrs Marina Hall +44 (0)20 7594 8259

Location

 

303, William Penney Laboratory, South Kensington Campus


Publications

91 results found

McCormac J, Handa A, Leutenegger S, Davison AJ et al., 2017, SceneNet RGB-D: Can 5M Synthetic Images Beat Generic ImageNet Pre-training on Indoor Segmentation?, International Conference on Computer Vision 2017, Publisher: IEEE

CONFERENCE PAPER

Lukierski R, Leutenegger S, Davison AJ, 2017, Room layout estimation from rapid omnidirectional exploration, Pages: 6315-6322, ISSN: 1050-4729

© 2017 IEEE. A new generation of practical, low-cost indoor robots is now using wide-angle cameras to aid navigation, but usually this is limited to position estimation via sparse feature-based SLAM. Such robots usually have little global sense of the dimensions, demarcation or identities of the rooms they are in, information which would be very useful for enabling much more intelligent, high-level behaviour. In this paper we show that we can augment an omnidirectional SLAM pipeline with straightforward dense stereo estimation and simple and robust room model fitting to obtain rapid and reliable estimation of the global shape of typical rooms from short robot motions. We have tested our method extensively in real homes, offices and on synthetic data. We also give examples of how our method can extend to making composite maps of larger rooms, and detecting room transitions.

CONFERENCE PAPER
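
The "simple and robust room model fitting" above is described only at a high level; as a rough illustration of the idea, the sketch below (all names hypothetical, not from the paper) fits an axis-aligned room box to noisy free-space points, using percentiles rather than min/max so that stray depth outliers cannot stretch the estimated walls.

```python
import numpy as np

def fit_room_box(points, trim=2.0):
    """Fit an axis-aligned 2D room rectangle to noisy free-space points.

    A robust stand-in for a room-model fitting step: trimming `trim`
    percent from each end of each axis keeps spurious depth points
    from inflating the estimated room dimensions.
    """
    points = np.asarray(points, dtype=float)           # shape (N, 2)
    lo = np.percentile(points, trim, axis=0)           # robust "min" corner
    hi = np.percentile(points, 100.0 - trim, axis=0)   # robust "max" corner
    return (lo + hi) / 2.0, hi - lo                    # centre, (width, depth)

# Toy usage: a 4 m x 3 m room plus two gross outliers.
rng = np.random.default_rng(0)
pts = rng.uniform([-2.0, -1.5], [2.0, 1.5], size=(500, 2))
pts = np.vstack([pts, [[10.0, 10.0], [-8.0, 3.0]]])    # spurious readings
centre, size = fit_room_box(pts)
print("room centre:", centre, "room size:", size)      # size stays near (4, 3)
```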

McCormac J, Handa A, Davison AJ, Leutenegger S et al., 2017, SemanticFusion: dense 3D semantic mapping with convolutional neural networks, IEEE International Conference on Robotics and Automation (ICRA), 2017, Publisher: IEEE

Ever more robust, accurate and detailed mapping using visual sensing has proven to be an enabling factor for mobile robots across a wide variety of applications. For the next level of robot intelligence and intuitive user interaction, maps need to extend beyond geometry and appearance — they need to contain semantics. We address this challenge by combining Convolutional Neural Networks (CNNs) and a state-of-the-art dense Simultaneous Localization and Mapping (SLAM) system, ElasticFusion, which provides long-term dense correspondences between frames of indoor RGB-D video even during loopy scanning trajectories. These correspondences allow the CNN's semantic predictions from multiple view points to be probabilistically fused into a map. This not only produces a useful semantic 3D map, but we also show on the NYUv2 dataset that fusing multiple predictions leads to an improvement even in the 2D semantic labelling over baseline single frame predictions. We also show that for a smaller reconstruction dataset with larger variation in prediction viewpoint, the improvement over single frame segmentation increases. Our system is efficient enough to allow real-time interactive use at frame-rates of ≈25Hz.

CONFERENCE PAPER
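
The probabilistic fusion step at the heart of SemanticFusion, in which per-class CNN predictions from many viewpoints are accumulated on a map element, amounts to a recursive Bayesian update. A minimal sketch of that update follows (hypothetical names; the real system applies it to surfels via the dense SLAM correspondences):

```python
import numpy as np

def fuse_semantic_prediction(map_probs, cnn_probs):
    """Recursive Bayesian fusion of a new per-class CNN prediction into
    the class distribution stored on a map element (e.g. a surfel).

    Treating each view's prediction as an independent observation, the
    stored distribution is multiplied by the new likelihood and
    renormalised; repeated agreeing views sharpen the estimate.
    """
    fused = map_probs * cnn_probs
    return fused / fused.sum()

# Toy usage with three classes: two noisy but agreeing predictions.
surfel = np.array([1/3, 1/3, 1/3])    # uninformative prior
pred = np.array([0.6, 0.3, 0.1])      # CNN output for one viewpoint
surfel = fuse_semantic_prediction(surfel, pred)
surfel = fuse_semantic_prediction(surfel, pred)
print(surfel)    # probability mass concentrates on class 0 (~0.78)
```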

Nardi L, Bodin B, Saeedi S, Vespa E, Davison AJ, Kelly PHJ et al., 2017, Algorithmic performance-accuracy trade-off in 3D vision applications using HyperMapper, Pages: 1434-1443

© 2017 IEEE. In this paper we investigate an emerging application, 3D scene understanding, likely to be significant in the mobile space in the near future. The goal of this exploration is to reduce execution time while meeting our quality of result objectives. In previous work, we showed for the first time that it is possible to map this application to power-constrained embedded systems, highlighting that decisions made at the algorithmic design level have the most significant impact. As the algorithmic design space is too large to be exhaustively evaluated, we use a previously introduced multi-objective random forest active learning prediction framework, dubbed HyperMapper, to find good algorithmic designs. We show that HyperMapper generalizes to a recent cutting-edge 3D scene understanding algorithm and to a modern GPU-based computer architecture. HyperMapper automatically beats an expert human at hand-tuning the algorithmic parameters of the class of computer vision applications considered in this paper. In addition, we use crowd-sourcing via a 3D scene understanding Android app to show that the Pareto front obtained on an embedded system can be used to accelerate the same application on all 83 smart-phones and tablets tested, with speedups ranging from 2x to over 12x.

CONFERENCE PAPER
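
The Pareto front mentioned above is the set of algorithmic designs that no other design beats in both runtime and accuracy at once. A minimal sketch of extracting it from measured (runtime, error) pairs, with made-up illustrative values rather than results from the paper:

```python
def pareto_front(designs):
    """Return the non-dominated subset of (runtime, error) design points.

    Design b dominates design a if b is no worse in both objectives and
    strictly better in at least one; the survivors form the Pareto front
    that a design-space exploration tool reports to the user.
    """
    front = []
    for i, a in enumerate(designs):
        dominated = any(
            b[0] <= a[0] and b[1] <= a[1] and b != a
            for j, b in enumerate(designs) if j != i
        )
        if not dominated:
            front.append(a)
    return front

# Toy usage: runtime (ms) versus tracking error for candidate settings.
candidates = [(30.0, 0.90), (45.0, 0.50), (80.0, 0.40), (90.0, 0.45), (120.0, 0.39)]
print(pareto_front(candidates))   # (90.0, 0.45) is dominated by (80.0, 0.40)
```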

Platinsky L, Davison AJ, Leutenegger S, 2017, Monocular visual odometry: Sparse joint optimisation or dense alternation?, Pages: 5126-5133, ISSN: 1050-4729

© 2017 IEEE. Real-time monocular SLAM is increasingly mature and entering commercial products. However, there is a divide between two techniques providing similar performance. Despite the rise of 'dense' and 'semi-dense' methods which use large proportions of the pixels in a video stream to estimate motion and structure via alternating estimation, they have not eradicated feature-based methods which use a significantly smaller amount of image information from keypoints and retain a more rigorous joint estimation framework. Dense methods provide more complete scene information, but in this paper we focus on how the amount of information and different optimisation methods affect the accuracy of local motion estimation (monocular visual odometry). This topic becomes particularly relevant after the recent results from a direct sparse system. We propose a new method for fairly comparing the accuracy of SLAM frontends in a common setting. We suggest computational cost models for an overall comparison which indicates that there is relative parity between the approaches at the settings allowed by current serial processors when evaluated under equal conditions.

CONFERENCE PAPER

Saeedi S, Nardi L, Johns E, Bodin B, Kelly PHJ, Davison AJ et al., 2017, Application-oriented design space exploration for SLAM algorithms, Pages: 5716-5723, ISSN: 1050-4729

© 2017 IEEE. In visual SLAM, there are many software and hardware parameters, such as algorithmic thresholds and GPU frequency, that need to be tuned; however, this tuning should also take into account the structure and motion of the camera. In this paper, we determine the complexity of the structure and motion with a few parameters calculated using information theory. Depending on this complexity and the desired performance metrics, suitable parameters are explored and determined. Additionally, based on the proposed structure and motion parameters, several applications are presented, including a novel active SLAM approach which guides the camera in such a way that the SLAM algorithm achieves the desired performance metrics. Real-world and simulated experimental results demonstrate the effectiveness of the proposed design space and its applications.

CONFERENCE PAPER
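
The abstract does not spell out which information-theoretic parameters are computed, so the following is a purely hypothetical illustration of the flavour of such a measure: the Shannon entropy of an image's intensity histogram as a proxy for scene texture complexity.

```python
import numpy as np

def texture_entropy(gray_image, bins=32):
    """Shannon entropy (bits) of an image's intensity histogram.

    A hypothetical stand-in for an information-theoretic complexity
    parameter: near-zero entropy suggests a bland, hard-to-track scene,
    while high entropy suggests rich texture.
    """
    hist, _ = np.histogram(gray_image, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                      # 0 * log 0 is taken as 0
    return float(-(p * np.log2(p)).sum())

# Toy usage: a featureless wall versus uniform random texture.
flat = np.full((64, 64), 0.5)
noisy = np.random.default_rng(1).uniform(size=(64, 64))
print(texture_entropy(flat), texture_entropy(noisy))   # ~0.0 vs ~5.0 bits
```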

Tsiotsios C, Davison AJ, Kim T-K, 2017, Near-lighting Photometric Stereo for unknown scene distance and medium attenuation, Image and Vision Computing, Vol: 57, Pages: 44-57, ISSN: 0262-8856

JOURNAL ARTICLE

Bardow P, Davison AJ, Leutenegger S, 2016, Simultaneous Optical Flow and Intensity Estimation from an Event Camera, 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Publisher: IEEE, Pages: 884-892, ISSN: 1063-6919

CONFERENCE PAPER

Handa A, Bloesch M, Patraucean V, Stent S, McCormac J, Davison A et al., 2016, gvnn: Neural Network Library for Geometric Computer Vision, 14th European Conference on Computer Vision (ECCV), Publisher: Springer International Publishing AG, Pages: 67-82, ISSN: 0302-9743

CONFERENCE PAPER

Johns E, Leutenegger S, Davison AJ, 2016, Pairwise Decomposition of Image Sequences for Active Multi-View Recognition, 29th IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Publisher: IEEE, Pages: 3813-3822, ISSN: 1063-6919

CONFERENCE PAPER

Johns E, Leutenegger S, Davison AJ, 2016, Deep Learning a Grasp Function for Grasping under Gripper Pose Uncertainty, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 4461-4468

CONFERENCE PAPER

Kim H, Leutenegger S, Davison AJ, 2016, Real-Time 3D Reconstruction and 6-DoF Tracking with an Event Camera, 14th European Conference on Computer Vision (ECCV), Publisher: Springer International Publishing AG, Pages: 349-364, ISSN: 0302-9743

CONFERENCE PAPER

Strasdat H, Montiel JMM, Davison AJ, 2016, Visual SLAM: Why filter?, Image and Vision Computing, ISSN: 0262-8856

While the most accurate solution to off-line structure from motion (SFM) problems is undoubtedly to extract as much correspondence information as possible and perform batch optimisation, sequential methods suitable for live video streams must approximate this to fit within fixed computational bounds. Two quite different approaches to real-time SFM - also called visual SLAM (simultaneous localisation and mapping) - have proven successful, but they sparsify the problem in different ways. Filtering methods marginalise out past poses and summarise the information gained over time with a probability distribution. Keyframe methods retain the optimisation approach of global bundle adjustment, but computationally must select only a small number of past frames to process. In this paper we perform a rigorous analysis of the relative advantages of filtering and sparse bundle adjustment for sequential visual SLAM. In a series of Monte Carlo experiments we investigate the accuracy and cost of visual SLAM. We measure accuracy in terms of entropy reduction as well as root mean square error (RMSE), and analyse the efficiency of bundle adjustment versus filtering using combined cost/accuracy measures. In our analysis, we consider both SLAM using a stereo rig and monocular SLAM as well as various different scenes and motion patterns. For all these scenarios, we conclude that keyframe bundle adjustment outperforms filtering, since it gives the most accuracy per unit of computing time. © 2012 Elsevier B.V. All rights reserved.

JOURNAL ARTICLE
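
The entropy-reduction accuracy measure used in the Monte Carlo comparison can be made concrete for Gaussian estimates: moving from a prior covariance to a posterior covariance reduces differential entropy by (1/2) log(det(cov_prior) / det(cov_post)). A minimal sketch under that Gaussian assumption (not code from the paper):

```python
import numpy as np

def entropy_reduction(cov_prior, cov_post):
    """Reduction in differential entropy (nats) between two Gaussian
    estimates of the same state, e.g. a camera pose before and after
    a batch of measurements: 0.5 * log(det(P_prior) / det(P_post)).
    """
    _, logdet_prior = np.linalg.slogdet(cov_prior)
    _, logdet_post = np.linalg.slogdet(cov_post)
    return 0.5 * (logdet_prior - logdet_post)

# Toy usage: halving the standard deviation of every component of a
# 6-DoF pose estimate removes 6 * ln(2), roughly 4.16 nats of uncertainty.
prior = np.eye(6)
post = 0.25 * np.eye(6)    # covariance scales with variance, hence 0.25
print(entropy_reduction(prior, post))
```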

Tsiotsios C, Kim TK, Davison AJ, Narasimhan SG et al., 2016, Model effectiveness prediction and system adaptation for photometric stereo in murky water, Computer Vision and Image Understanding, Vol: 150, Pages: 126-138, ISSN: 1077-3142

JOURNAL ARTICLE

Whelan T, Salas-Moreno RF, Glocker B, Davison AJ, Leutenegger S et al., 2016, ElasticFusion: Real-time dense SLAM and light source estimation, International Journal of Robotics Research, Vol: 35, Pages: 1697-1716, ISSN: 0278-3649

JOURNAL ARTICLE

Zia MZ, Nardi L, Jack A, Vespa E, Bodin B, Kelly PHJ, Davison AJ et al., 2016, Comparative Design Space Exploration of Dense and Semi-Dense SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 1292-1299, ISSN: 1050-4729

CONFERENCE PAPER

Zienkiewicz J, Davison A, Leutenegger S, 2016, Real-Time Height Map Fusion using Differentiable Rendering, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Publisher: IEEE, Pages: 4280-4287

CONFERENCE PAPER

Zienkiewicz J, Tsiotsios A, Davison A, Leutenegger S et al., 2016, Monocular, Real-Time Surface Reconstruction using Dynamic Level of Detail, 4th IEEE International Conference on 3D Vision (3DV), Publisher: IEEE, Pages: 37-46

CONFERENCE PAPER

Jachnik J, Goldman DB, Luo L, Davison AJ et al., 2015, Interactive 3D Face Stylization Using Sculptural Abstraction, CoRR, Vol: abs/1502.01954

JOURNAL ARTICLE

Lukierski R, Leutenegger S, Davison AJ, 2015, Rapid Free-Space Mapping From a Single Omnidirectional Camera, European Conference on Mobile Robots, Publisher: IEEE

CONFERENCE PAPER

Milford M, Kim H, Mangan M, Leutenegger S, Stone T, Webb B, Davison AJ et al., 2015, Place Recognition with Event-based Cameras and a Neural Implementation of SeqSLAM, CoRR, Vol: abs/1505.04548

JOURNAL ARTICLE

Nardi L, Bodin B, Zia MZ, Mawer J, Nisbet A, Kelly PHJ, Davison AJ, Luján M, O'Boyle MFP, Riley GD, Topham N, Furber SB et al., 2015, Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE Computer Society, Pages: 5783-5790, ISSN: 1050-4729

CONFERENCE PAPER

Whelan T, Leutenegger S, Salas-Moreno RF, Glocker B, Davison AJ et al., 2015, ElasticFusion: Dense SLAM without a pose graph

© 2015, MIT Press Journals. All rights reserved. We present a novel approach to real-time dense visual SLAM. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale environments explored using an RGB-D camera in an incremental online fashion, without pose graph optimisation or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimisations as often as possible to stay close to the mode of the map distribution, while utilising global loop closure to recover from arbitrary drift and maintain global consistency.

CONFERENCE PAPER
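
Surfel-based fusion of the kind described in the abstract typically blends each associated new measurement into a surfel via a running confidence weight, so stable surfaces converge while sensor noise averages out. The sketch below shows this generic weighted-average update (an illustration of the technique, not ElasticFusion's exact rule):

```python
import numpy as np

def fuse_surfel(pos, normal, weight, new_pos, new_normal, new_weight=1.0):
    """Blend an associated new measurement into a surfel by confidence-
    weighted averaging, then renormalise the surfel normal.

    A generic sketch of weighted surfel fusion, not ElasticFusion's
    exact update rule.
    """
    total = weight + new_weight
    pos = (weight * pos + new_weight * new_pos) / total
    normal = (weight * normal + new_weight * new_normal) / total
    normal /= np.linalg.norm(normal)
    return pos, normal, total

# Toy usage: three noisy depth measurements of the same surface point.
p, n, w = np.array([0.0, 0.0, 2.0]), np.array([0.0, 0.0, 1.0]), 1.0
for z in (2.02, 1.99, 2.01):
    p, n, w = fuse_surfel(p, n, w, np.array([0.0, 0.0, z]), np.array([0.0, 0.0, 1.0]))
print(p, w)   # position converges toward the true depth; confidence grows
```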

Zienkiewicz J, Davison A, 2015, Extrinsics Autocalibration for Dense Planar Visual Odometry, Journal of Field Robotics, Vol: 32, Pages: 803-825, ISSN: 1556-4959

JOURNAL ARTICLE

Chang PL, Handa A, Davison AJ, Stoyanov D, Edwards PE et al., 2014, Robust real-time visual odometry for stereo endoscopy using dense quadrifocal tracking, Pages: 11-20, ISSN: 0302-9743

Visual tracking in endoscopic scenes is known to be a difficult task due to the lack of textures, tissue deformation and specular reflection. In this paper, we devise a real-time visual odometry framework to robustly track the 6-DoF stereo laparoscope pose using the quadrifocal relationship. The instant motion of a stereo camera creates four views which can be constrained by the quadrifocal geometry. Using the previous stereo pair as a reference frame, the current pair can be warped back by minimising a photometric error function with respect to a camera pose constrained by the quadrifocal geometry. Using a robust estimator can further remove the outliers caused by occlusion, deformation and specular highlights during the optimisation. Since the optimisation uses all pixel data in the images, it results in a very robust pose estimation even for a textureless scene. The quadrifocal geometry is initialised by using real-time stereo reconstruction algorithm which can be efficiently parallelised and run on the GPU together with the proposed tracking framework. Our system is evaluated using a ground truth synthetic sequence with a known model and we also demonstrate the accuracy and robustness of the approach using phantom and real examples of endoscopic augmented reality. © 2014 Springer International Publishing Switzerland.

CONFERENCE PAPER
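
The photometric error minimised by this framework can be illustrated independently of the quadrifocal geometry. In the sketch below, `warp` is a placeholder for the geometry-constrained mapping from reference pixels into the current image (a plain horizontal shift in the toy usage), and a Huber-style loss stands in for the robust estimator that suppresses specular and deformation outliers.

```python
import numpy as np

def photometric_error(ref_img, cur_img, warp, pose, delta=0.1):
    """Robust photometric cost of a candidate pose.

    `warp(pose, x, y)` is assumed to map a reference pixel into the
    current image (in the paper this warp is constrained by quadrifocal
    geometry); a Huber loss down-weights outliers from specular
    highlights and tissue deformation.
    """
    err = 0.0
    h, w = ref_img.shape
    for y in range(h):
        for x in range(w):
            u, v = warp(pose, x, y)
            if 0 <= u < w and 0 <= v < h:
                r = abs(ref_img[y, x] - cur_img[int(v), int(u)])
                # Huber: quadratic near zero, linear in the outlier tails.
                err += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return err

# Toy usage: recover a pure 2-pixel horizontal shift by scoring candidates.
rng = np.random.default_rng(2)
ref = rng.uniform(size=(32, 32))
cur = np.roll(ref, 2, axis=1)
shift_warp = lambda pose, x, y: (x + pose, y)
print(min(range(-4, 5), key=lambda s: photometric_error(ref, cur, shift_warp, s)))  # 2
```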

Handa A, Whelan T, McDonald J, Davison AJet al., 2014, A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE, Pages: 1524-1531, ISSN: 1050-4729

CONFERENCE PAPER

Kim H, Handa A, Benosman R, Ieng SH, Davison AJ et al., 2014, Simultaneous mosaicing and tracking with an event camera

© 2014. The copyright of this document resides with its authors. An event camera is a silicon retina which outputs not a sequence of video frames like a standard camera, but a stream of asynchronous spikes, each with pixel location, sign and precise timing, indicating when individual pixels record a threshold log intensity change. By encoding only image change, it offers the potential to transmit the information in a standard video but at vastly reduced bitrate, and with huge added advantages of very high dynamic range and temporal resolution. However, event data calls for new algorithms, and in particular we believe that algorithms which incrementally estimate global scene models are best placed to take full advantages of its properties. Here, we show for the first time that an event stream, with no additional sensing, can be used to track accurate camera rotation while building a persistent and high quality mosaic of a scene which is super-resolution accurate and has high dynamic range. Our method involves parallel camera rotation tracking and template reconstruction from estimated gradients, both operating on an event-by-event basis and based on probabilistic filtering.

CONFERENCE PAPER
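
The event-camera measurement model stated in the abstract, where each event signals a signed threshold crossing in log intensity at one pixel, can be written in a few lines. The sketch below (hypothetical names) only integrates events per pixel; the paper's contribution is the simultaneous rotation tracking and super-resolution, high-dynamic-range mosaic built on top of this model.

```python
import numpy as np

def integrate_events(events, shape, contrast=0.1):
    """Accumulate per-pixel log intensity from an event stream.

    Each event (x, y, sign, t) means pixel (x, y)'s log intensity
    crossed the threshold `contrast` in the given direction at time t,
    so summing signed thresholds recovers log intensity up to an
    unknown per-pixel offset. Timestamps are unused in this sketch.
    """
    log_intensity = np.zeros(shape)
    for x, y, sign, t in events:
        log_intensity[y, x] += sign * contrast
    return log_intensity

# Toy usage: one pixel brightens by three threshold steps, one darkens.
events = [(1, 0, +1, 0.001), (1, 0, +1, 0.002), (0, 1, -1, 0.004), (1, 0, +1, 0.005)]
print(integrate_events(events, shape=(2, 2)))
```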

Salas-Moreno RF, Glocker B, Kelly PHJ, Davison AJ et al., 2014, Dense Planar SLAM, IEEE International Symposium on Mixed and Augmented Reality (ISMAR) - Science and Technology, Publisher: IEEE, Pages: 157-164, ISSN: 1554-7868

CONFERENCE PAPER

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
