Imperial College London

Professor Andrew Davison

Faculty of Engineering, Department of Computing

Professor of Robot Vision
 
 
 

Contact

 

+44 (0)20 7594 8316
a.davison
Website

 
 

Assistant

 

Ms Lucy Atthis, +44 (0)20 7594 8259

 

Location

 

303, William Penney Laboratory, South Kensington Campus


Summary

 

Publications

Citation

BibTeX format

@inproceedings{McCormac:2018:10.1109/3DV.2018.00015,
author = {McCormac, J and Clark, R and Bloesch, M and Davison, A and Leutenegger, S},
booktitle = {2018 International Conference on 3D Vision (3DV)},
doi = {10.1109/3DV.2018.00015},
pages = {32--41},
publisher = {IEEE},
title = {Fusion++: Volumetric object-level SLAM},
url = {http://dx.doi.org/10.1109/3DV.2018.00015},
year = {2018}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB - We propose an online object-level SLAM system which builds a persistent and accurate 3D graph map of arbitrary reconstructed objects. As an RGB-D camera browses a cluttered indoor scene, Mask-RCNN instance segmentations are used to initialise compact per-object Truncated Signed Distance Function (TSDF) reconstructions with object size-dependent resolutions and a novel 3D foreground mask. Reconstructed objects are stored in an optimisable 6DoF pose graph which is our only persistent map representation. Objects are incrementally refined via depth fusion, and are used for tracking, relocalisation and loop closure detection. Loop closures cause adjustments in the relative pose estimates of object instances, but no intra-object warping. Each object also carries semantic information which is refined over time and an existence probability to account for spurious instance predictions. We demonstrate our approach on a hand-held RGB-D sequence from a cluttered office scene with a large number and variety of object instances, highlighting how the system closes loops and makes good use of existing objects on repeated loops. We quantitatively evaluate the trajectory error of our system against a baseline approach on the RGB-D SLAM benchmark, and qualitatively compare reconstruction quality of discovered objects on the YCB video dataset. Performance evaluation shows our approach is highly memory efficient and runs online at 4-8Hz (excluding relocalisation) despite not being optimised at the software level.
AU - McCormac,J
AU - Clark,R
AU - Bloesch,M
AU - Davison,A
AU - Leutenegger,S
DO - 10.1109/3DV.2018.00015
EP - 41
PB - IEEE
PY - 2018///
SN - 2378-3826
SP - 32
TI - Fusion++: Volumetric object-level SLAM
UR - http://dx.doi.org/10.1109/3DV.2018.00015
UR - http://hdl.handle.net/10044/1/65126
ER -
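
The abstract describes a map built from compact per-object TSDF volumes whose 6DoF poses live in a pose graph, each object also carrying refined semantic labels and an existence probability. The Python sketch below illustrates that data structure only; the class names, fields, and size-dependent resolution heuristic are assumptions made for exposition, not the authors' implementation.

import numpy as np

class ObjectInstance:
    """One reconstructed object: a compact TSDF volume plus bookkeeping.

    Field names and the resolution heuristic are illustrative assumptions,
    not taken from the Fusion++ paper or its code.
    """

    def __init__(self, instance_id, size_m, base_voxels=64):
        self.instance_id = instance_id
        # Object-size-dependent resolution: larger objects get larger voxels,
        # so every volume stays compact regardless of physical size.
        self.voxel_size = size_m / base_voxels
        shape = (base_voxels, base_voxels, base_voxels)
        self.tsdf = np.ones(shape, dtype=np.float32)      # truncated signed distances
        self.weights = np.zeros(shape, dtype=np.float32)  # per-voxel fusion weights
        self.pose = np.eye(4)       # 6DoF object pose as a 4x4 SE(3) matrix
        self.class_probs = {}       # semantic label -> probability, refined over time
        self.existence_prob = 0.5   # lowered when detections stop supporting the object

    def fuse_depth(self, tsdf_update, weight_update):
        """Standard weighted-average TSDF fusion of one new depth observation."""
        total = self.weights + weight_update
        mask = total > 0
        self.tsdf[mask] = (self.weights[mask] * self.tsdf[mask] +
                           weight_update[mask] * tsdf_update[mask]) / total[mask]
        self.weights = total

class ObjectPoseGraph:
    """The only persistent map: objects as nodes, relative-pose constraints as edges."""

    def __init__(self):
        self.objects = {}  # instance_id -> ObjectInstance
        self.edges = []    # (id_a, id_b, measured relative pose) constraints

    def add_object(self, obj):
        self.objects[obj.instance_id] = obj

    def close_loop(self, new_constraints):
        """On loop closure, re-optimise whole-object poses; volumes stay rigid."""
        self.edges.extend(new_constraints)
        # A real system would run pose-graph optimisation over self.edges here,
        # updating each ObjectInstance.pose while leaving its TSDF untouched.

Because loop closures only adjust the poses stored in the graph, each object's TSDF volume stays rigid, matching the abstract's point that closures cause no intra-object warping.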