Scientists at Imperial are developing vision technology that could enable the domestic robots of the future to truly see their environment.
Computer vision has long been an important component in automated industrial processes. For instance, static cameras facing a conveyor belt are used so that robots can identify faulty products. Recently, computer vision technology has moved into the home with the introduction of vacuum cleaners that can crudely sense where they are in relation to their surroundings. However, this technology is still in its infancy and does not give robots the information required to negotiate and carry out more complex tasks.
Now, researchers from the Dyson Robotics Lab at Imperial have developed technology called ElasticFusion. This prototype technology could enable a robot to create a clearer representation of its environment and simultaneously determine where it is within that environment – an important initial step towards enabling robots to work closely with humans at home.
Ultimately, the team is aiming for the technology to not only map the geometry of an environment, but also label different items within it. This would allow for more sophisticated processes to take place, such as robots being able to identify appliances like the toaster or dishwasher, accessing information over the internet about how to use them, and operating them.
The ElasticFusion technology was developed by Dyson Fellow Dr Tom Whelan in collaboration with Dr Stefan Leutenegger and Professor Andrew Davison who are all from the Lab.
Professor Andrew Davison, Director of the Dyson Robotics Laboratory at Imperial, said: “The family home is actually quite a complex environment for a robot to map. Houses are filled with breakable objects, family members that constantly move about, and a range of complex appliances that need to be operated safely. Domestic robots will need to negotiate all these challenges with aplomb to become a useful tool for making our lives easier. ElasticFusion is the first step towards making vision that can help a robot to negotiate the complexities of the home.”
The hardware consists of an off-the-shelf depth-sensing camera, which turns what it sees in its field of view into millions of pixels. This enables the technology to record the colour and distance of each pixel within the field of view.
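To make the per-pixel data concrete, here is a minimal sketch of what a single RGB-D frame provides: a colour value and a distance for every pixel. This is purely illustrative – the tiny 2×2 frame, the list-of-lists layout, and the function name are invented for the example, not taken from ElasticFusion itself.

```python
# Illustrative sketch only: an RGB-D frame pairs a colour with a
# distance (in metres) for every pixel in the field of view.

def rgbd_frame_to_points(colour, depth):
    """Combine a colour image and a same-sized depth image into a
    list of (r, g, b, distance_in_metres) records, one per pixel."""
    points = []
    for row_c, row_d in zip(colour, depth):
        for (r, g, b), d in zip(row_c, row_d):
            points.append((r, g, b, d))
    return points

# A toy 2x2 "frame": colour is (r, g, b), depth is metres from the camera.
colour = [[(255, 0, 0), (0, 255, 0)],
          [(0, 0, 255), (255, 255, 255)]]
depth = [[1.2, 1.3],
         [0.9, 4.8]]

points = rgbd_frame_to_points(colour, depth)
print(len(points))   # one record per pixel
print(points[0])     # (255, 0, 0, 1.2)
```

A real camera produces hundreds of thousands of such records thirty times a second; the principle of one colour-plus-distance measurement per pixel is the same.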
The data is then fed into a computer program that maps a room in real-time. The program takes the recorded information about each pixel and turns it into an initial 3D map. The next two steps in the process are alternated when the camera is moved around to explore the environment. The first, called ‘tracking’, is where the camera position and orientation is found that best matches the depth and colour images with the initial 3D map. The second, called ‘mapping’, takes the new depth and colour image data and fuses it into the initial 3D map. This process can be repeated over and over again in real-time to track the camera and map more of the room.
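The alternation between tracking and mapping can be sketched as a simple loop. The sketch below is a stand-in, not the real system: the pose is reduced to a single number, `track` trivially applies each frame's offset rather than performing the real photometric and geometric alignment, and `fuse` just stores points under the estimated pose instead of fusing surfels. It only illustrates the two-step structure described above.

```python
# Toy sketch of the alternating track/map loop: each new frame is first
# aligned against the current map (tracking), then fused into it (mapping).

def track(frame, model, pose):
    """Stand-in for tracking: estimate the new camera pose. The real
    system searches for the pose that best matches the frame's depth
    and colour against the 3D map; here we just apply a known offset."""
    return pose + frame["offset"]

def fuse(frame, model, pose):
    """Stand-in for mapping: fuse the frame's data into the map at the
    estimated pose."""
    model[pose] = frame["points"]
    return model

model = {}   # the growing map (toy form: pose -> fused points)
pose = 0     # camera pose (toy form: a 1-D position)

frames = [{"offset": 1, "points": ["a"]},
          {"offset": 1, "points": ["b"]},
          {"offset": 2, "points": ["c"]}]

for frame in frames:
    pose = track(frame, model, pose)   # step 1: tracking
    model = fuse(frame, model, pose)   # step 2: mapping

print(pose)           # final camera pose
print(sorted(model))  # poses at which data was fused into the map
```

Because each step uses the output of the other – tracking aligns against the map, mapping fuses at the tracked pose – the loop can run repeatedly in real time as the camera explores the room.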
As the camera moves around, small errors inevitably accumulate, skewing the map. The team say the advantage of their technology is that these errors can be rectified in a fraction of a second, which is extremely important if robots in the future are to see accurately. This is because ElasticFusion is programmed to constantly check if the field of vision has returned to a part of the room it has previously observed. If it has, the program elastically bends the whole 3D map around, so that the old part of the map and the new part are perfectly re-aligned again.
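The loop-closure idea can be illustrated with a deliberately simplified example. The real system deforms a full 3D surfel map; the sketch below instead corrects a 1-D trajectory, where each recorded pose has drifted slightly even though the camera physically returns to its starting point. The function name and the linear spreading of the correction are assumptions for illustration only.

```python
# Toy sketch of loop-closure correction: when the camera revisits a place
# it has seen before, the accumulated drift is measured and spread back
# through the recorded trajectory, analogous to ElasticFusion "bending"
# the map so old and new parts re-align.

def close_loop(trajectory, revisit_index):
    """The camera is physically back where it was at `revisit_index`, so
    the recorded difference between the two poses is pure drift. Spread
    the correction linearly over the poses recorded since that visit."""
    drift = trajectory[-1] - trajectory[revisit_index]
    n = len(trajectory) - 1 - revisit_index
    return [p - drift * max(0, i - revisit_index) / n
            for i, p in enumerate(trajectory)]

# Poses drift upward even though the camera ends where pose 0 was taken.
trajectory = [0.0, 1.0, 2.1, 1.2, 0.3]
corrected = close_loop(trajectory, 0)
print(round(corrected[-1], 6))   # back to 0.0: the loop is closed
```

Spreading the correction through the whole trajectory, rather than only fixing the final pose, is what keeps the intermediate parts of the map consistent after re-alignment.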
One of the current drawbacks with ElasticFusion is that most of the depth cameras currently on the market only work indoors, and they can only create images within a visual range of between 40 centimetres and five metres. A limitation of the computer program is that it cannot currently allow for proper mapping of moving objects such as people. The technology also relies on the same graphics cards used in gaming technology, which have constraints in terms of power consumption and processing capacity.
The team are now aiming to improve the mapping technology and address scalability of the method so that larger spaces can be mapped faster, using less memory. They also want the technology to be able to distinguish moveable objects within the scene such as people and pets.
The software is available to download under a free license for research purposes.
Article text (excluding photos or graphics) available under an Attribution-NonCommercial-ShareAlike Creative Commons license.
Photos and graphics subject to third party copyright used with permission or © Imperial College London.
Martin Sayers [Digital Media Producer]
Communications and Public Affairs