Milford M, Kim H, Mangan M, et al., 2015, Place Recognition with Event-based Cameras and a Neural Implementation of SeqSLAM
Event-based cameras offer much potential to the fields of robotics and computer vision, in part due to their large dynamic range and extremely high "frame rates". These attributes make them, at least in theory, particularly suitable for enabling tasks like navigation and mapping on high-speed robotic platforms under challenging lighting conditions, a task which has been particularly challenging for traditional algorithms and camera sensors. Before these tasks become feasible, however, progress must be made towards adapting and innovating current RGB-camera-based algorithms to work with event-based cameras. In this paper we present ongoing research investigating two distinct approaches to incorporating event-based cameras for robotic navigation: the investigation of suitable place recognition / loop closure techniques, and the development of efficient neural implementations of place recognition techniques that enable the possibility of place recognition using event-based cameras at very high frame rates using neuromorphic computing hardware.
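The sequence-based matching at the heart of SeqSLAM can be sketched in a few lines: rather than matching single frames, each candidate place is scored by summing per-frame difference scores along a short aligned sequence. The function below is a simplified illustration only (a constant-velocity diagonal search over a hypothetical difference matrix), not the authors' implementation:

```python
import numpy as np

def best_sequence_match(diff_matrix, seq_len):
    """Given a pairwise difference matrix D[i, j] between query frame i
    and reference frame j, score each reference start position by the
    summed differences along a diagonal of length seq_len (a
    constant-velocity alignment) and return the best start index."""
    n_query, n_ref = diff_matrix.shape
    assert n_query >= seq_len
    scores = []
    for start in range(n_ref - seq_len + 1):
        # sum the per-frame differences along this candidate alignment
        scores.append(sum(diff_matrix[k, start + k] for k in range(seq_len)))
    return int(np.argmin(scores)), min(scores)
```

For instance, a 3-frame query whose best per-frame matches lie at reference frames 2, 3 and 4 yields start index 2 with the minimum summed score.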
Oettershagen P, Melzer A, Mantel T, et al., 2015, A Solar-Powered Hand-Launchable UAV for Low-Altitude Multi-Day Continuous Flight, IEEE International Conference on Robotics and Automation (ICRA), Publisher: IEEE COMPUTER SOC, Pages: 3986-3993, ISSN: 1050-4729
Leutenegger S, Lynen S, Bosse M, et al., 2014, Keyframe-based visual–inertial odometry using nonlinear optimization, The International Journal of Robotics Research, Vol: 34, Pages: 314-334, ISSN: 0278-3649
Combining visual and inertial measurements has become popular in mobile robotics, since the two sensing modalities offer complementary characteristics that make them the ideal choice for accurate visual–inertial odometry or simultaneous localization and mapping (SLAM). While historically the problem has been addressed with filtering, advancements in visual estimation suggest that nonlinear optimization offers superior accuracy while remaining tractable in complexity thanks to the sparsity of the underlying problem. Taking inspiration from these findings, we formulate a rigorously probabilistic cost function that combines reprojection errors of landmarks and inertial terms. The problem is kept tractable, and real-time operation thus ensured, by limiting the optimization to a bounded window of keyframes through marginalization. Keyframes may be spaced in time by arbitrary intervals, while still being related by linearized inertial terms. We present evaluation results on complementary datasets recorded with our custom-built stereo visual–inertial hardware that accurately synchronizes accelerometer and gyroscope measurements with imagery. A comparison of both a stereo and a monocular version of our algorithm, with and without online extrinsics estimation, is shown with respect to ground truth. Furthermore, we compare the performance to an implementation of a state-of-the-art stochastic cloning sliding-window filter. This competitive reference implementation performs tightly coupled filtering-based visual–inertial odometry. While our approach admittedly demands more computation, we show its superior performance in terms of accuracy.
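The cost function described above sums reprojection and inertial error terms, each weighted by its information matrix (the inverse covariance). A schematic sketch of that structure, with all residuals and weights supplied as hypothetical inputs rather than computed from real sensor models:

```python
import numpy as np

def combined_cost(reproj_residuals, reproj_infos, imu_residuals, imu_infos):
    """Schematic visual-inertial cost: sum of squared reprojection and
    inertial residuals, each weighted by its information matrix W
    (inverse covariance): J = sum_i e_i^T W_i e_i."""
    J = 0.0
    for e, W in zip(reproj_residuals, reproj_infos):
        J += float(e @ W @ e)  # reprojection term
    for e, W in zip(imu_residuals, imu_infos):
        J += float(e @ W @ e)  # inertial term
    return J
```

In the paper this cost is minimized over a bounded keyframe window; here it is only evaluated, to show how the two error classes enter a single objective.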
Leutenegger S, 2014, Unmanned Solar Airplanes: Design and Algorithms for Efficient and Robust Autonomous Operation
Nikolic J, Rehder J, Burri M, et al., 2014, A Synchronized Visual-Inertial Sensor System with FPGA Pre-Processing for Accurate Real-Time SLAM
Leutenegger S, Melzer A, Alexis K, et al., 2014, Robust state estimation for small unmanned airplanes, Pages: 1003-1010
Oettershagen P, Melzer A, Leutenegger S, et al., 2014, Explicit model predictive control and L1-navigation strategies for fixed-wing UAV path tracking, Pages: 1159-1165
Leutenegger S, Furgale P, Rabaud V, et al., 2013, Keyframe-Based Visual-Inertial SLAM using Nonlinear Optimization, Robotics: Science and Systems (RSS), ISSN: 2330-765X
Nikolic J, Burri M, Rehder J, et al., 2013, A UAV system for inspection of industrial facilities, Pages: 1-8
Burri M, Gasser L, Kach M, et al., 2013, Design and control of a spherical omnidirectional blimp, Pages: 1873-1879
Bloesch M, Hutter M, Hoepflinger MA, et al., 2013, State estimation for legged robots - consistent fusion of leg kinematics and IMU, Publisher: MIT Press, Pages: 17-17
Marconi L, Melchiorri C, Beetz M, et al., 2012, The SHERPA project: smart collaboration between humans and ground-aerial robots for improving rescuing activities in alpine environments, IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), Publisher: IEEE
Leutenegger S, Siegwart RY, 2012, A low-cost and fail-safe inertial navigation system for airplanes, Pages: 612-618
Leutenegger S, Jabas M, Siegwart RY, 2011, Solar Airplane Conceptual Design and Performance Estimation: What Size to Choose and What Endurance to Expect, Journal of Intelligent & Robotic Systems, Vol: 61, Pages: 545-561, ISSN: 0921-0296
Fankhauser P, Bouabdallah S, Leutenegger S, et al., 2011, Modeling and decoupling control of the coax micro helicopter, Pages: 2223-2228
Leutenegger S, Chli M, Siegwart RY, 2011, BRISK: Binary robust invariant scalable keypoints, Pages: 2548-2555
Bermes C, Leutenegger S, Bouabdallah S, et al., 2008, New design of the steering mechanism for a mini coaxial helicopter, Pages: 1236-1241
Bell DJ, Leutenegger S, Hammar KM, et al., 2007, Flagella-like propulsion for microrobots using a nanocoil and a rotating electromagnetic field, IEEE International Conference on Robotics and Automation, Publisher: IEEE, Pages: 1128+, ISSN: 1050-4729
Li W, Saeedi S, McCormac J, et al., InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM). Without a doubt, synthetic imagery bears vast potential due to scalability in terms of the amounts of data obtainable without tedious manual ground truth annotations or measurements. Here, we present a dataset with the aim of providing a higher degree of photo-realism, larger scale, and more variability, as well as serving a wider range of purposes compared to existing datasets. Our dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets, all coming with fine geometric details and high-resolution textures. We render high-resolution and high frame-rate video sequences following realistic trajectories while supporting various camera types as well as providing inertial measurements. Together with the release of the dataset, we will make the executable program of our interactive simulator software as well as our renderer available at https://interiornetdataset.github.io. To showcase the usability and uniqueness of our dataset, we show benchmarking results of both sparse and dense SLAM algorithms.
McCormac J, Handa A, Leutenegger S, et al., SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth
We introduce SceneNet RGB-D, expanding the previous work of SceneNet to enable large-scale photorealistic rendering of indoor scene trajectories. It provides pixel-perfect ground truth for scene understanding problems such as semantic segmentation, instance segmentation, and object detection, and also for geometric computer vision problems such as optical flow, depth estimation, camera pose estimation, and 3D reconstruction. Random sampling permits virtually unlimited scene configurations, and here we provide a set of 5M rendered RGB-D images from over 15K trajectories in synthetic layouts with random but physically simulated object poses. Each layout also has random lighting, camera trajectories, and textures. The scale of this dataset is well suited for pre-training data-driven computer vision techniques from scratch with RGB-D inputs, which has previously been limited by the relatively small labelled datasets NYUv2 and SUN RGB-D. It also provides a basis for investigating 3D scene labelling tasks by providing perfect camera poses and depth data as a proxy for a SLAM system. We host the dataset at http://robotvault.bitbucket.io/scenenet-rgbd.html
Bloesch M, Sommer H, Laidlow T, et al., A Primer on the Differential Calculus of 3D Orientations
The proper handling of 3D orientations is a central element in many optimization problems in engineering. Unfortunately, many researchers and engineers struggle with the formulation of such problems and often fall back to suboptimal solutions. The existence of many different conventions further complicates this issue, especially when interfacing multiple differing implementations. This document discusses an alternative approach which makes use of a more abstract notion of 3D orientations. The relative orientation between two coordinate systems is primarily identified by the coordinate mapping it induces. This is combined with the standard exponential map in order to introduce representation-independent and minimal differentials, which are very convenient in optimization-based methods.
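The exponential map mentioned above takes a minimal 3-vector (a rotation vector) to a rotation matrix. A minimal numerical sketch via the Rodrigues formula, in one common convention among the many the abstract alludes to:

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix [w]_x such that [w]_x v = w x v."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Exponential map from a rotation vector (axis times angle) to a
    3x3 rotation matrix, via the Rodrigues formula."""
    w = np.asarray(w, dtype=float)
    theta = np.linalg.norm(w)
    W = skew(w)
    if theta < 1e-8:
        return np.eye(3) + W  # first-order approximation near identity
    return (np.eye(3)
            + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))
```

For example, exp_so3([0, 0, pi/2]) rotates the x-axis onto the y-axis, as expected for a quarter turn about z.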
Zhi S, Bloesch M, Leutenegger S, et al., SceneCode: Monocular Dense Semantic Reconstruction using Learned Encoded Scene Representations
Systems which incrementally create 3D semantic maps from image sequences must store and update representations of both geometry and semantic entities. However, while there has been much work on the correct formulation for geometrical estimation, state-of-the-art systems usually rely on simple semantic representations which store and update independent label estimates for each surface element (depth pixels, surfels, or voxels). Spatial correlation is discarded, and fused label maps are incoherent and noisy. We introduce a new compact and optimisable semantic representation by training a variational auto-encoder that is conditioned on a colour image. Using this learned latent space, we can tackle semantic label fusion by jointly optimising the low-dimensional codes associated with each of a set of overlapping images, producing consistent fused label maps which preserve spatial correlation. We also show how this approach can be used within a monocular keyframe-based semantic mapping system where a similar code approach is used for geometry. The probabilistic formulation allows us to jointly estimate motion, geometry and semantics in a unified optimisation.
Gallego G, Delbruck T, Orchard G, et al., Event-based Vision: A Survey
Event cameras are bio-inspired sensors that work radically differently from traditional cameras. Instead of capturing images at a fixed rate, they measure per-pixel brightness changes asynchronously. This results in a stream of events, which encode the time, location and sign of the brightness changes. Event cameras possess outstanding properties compared to traditional cameras: very high dynamic range (140 dB vs. 60 dB), high temporal resolution (on the order of microseconds), low power consumption, and no motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as high speed and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
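The event stream described above, encoding the time, location and sign of brightness changes, is often reduced to a frame so that conventional algorithms can consume it. A minimal sketch of such signed accumulation; the (t, x, y, polarity) tuple format here is an illustrative assumption, not a fixed standard:

```python
import numpy as np

def accumulate_events(events, width, height):
    """Accumulate (t, x, y, polarity) events into a signed image:
    +1 per ON event and -1 per OFF event at each pixel."""
    frame = np.zeros((height, width), dtype=np.int32)
    for t, x, y, pol in events:
        frame[y, x] += 1 if pol > 0 else -1
    return frame
```

Such event frames discard the fine temporal resolution that makes event cameras attractive; the survey covers representations (time surfaces, voxel grids, spiking networks) that preserve more of it.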
Clark R, Bloesch M, Czarnowski J, et al., LS-Net: Learning to Solve Nonlinear Least Squares for Monocular Stereo
Sum-of-squares objective functions are very popular in computer vision algorithms. However, these objective functions are not always easy to optimize. The underlying assumptions made by solvers are often not satisfied and many problems are inherently ill-posed. In this paper, we propose LS-Net, a neural nonlinear least squares optimization algorithm which learns to effectively optimize these cost functions even in the presence of adversities. Unlike traditional approaches, the proposed solver requires no hand-crafted regularizers or priors as these are implicitly learned from the data. We apply our method to the problem of motion stereo, i.e. jointly estimating the motion and scene geometry from pairs of images of a monocular sequence. We show that our learned optimizer is able to efficiently and effectively solve this challenging optimization problem.
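LS-Net learns its update steps from data; the classical baseline it replaces is Gauss–Newton iteration on the sum-of-squares objective. For contrast, that baseline can be sketched on a toy curve-fitting problem (the problem and all names here are hypothetical, not from the paper):

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=50):
    """Minimise ||r(x)||^2 by repeatedly solving the linearised
    normal equations J^T J dx = -J^T r."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J, -J.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < 1e-12:
            break
    return x

# toy problem: fit y = a * exp(b * t) to noiseless samples of (a, b) = (2, -1.5)
t = np.linspace(0.0, 1.0, 10)
y = 2.0 * np.exp(-1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t),
                          p[0] * t * np.exp(p[1] * t)], axis=1)
p_hat = gauss_newton(res, jac, [1.5, -1.0])
```

On well-behaved problems like this one the iteration converges quickly; the paper's point is that on ill-posed vision problems such hand-built solvers struggle, which is what the learned optimizer addresses.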
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.