MIT engineers develop technique that allows robots to find front door without having to map an area in advance


MIT engineers have developed a navigation method that helps last-mile delivery vehicles find the front door without having to map an area in advance.

With the approach developed by MIT engineers, a robot would use clues in its environment to plan a route to its destination, which would be described not as coordinates on a map, but in general semantic terms such as “front door” or “garage.”

So a robot charged with delivering a package to someone's front door could briefly explore the property before identifying its target, without having to rely on maps of specific residences.

“We wouldn’t want to have to make a map of every building that we’d need to visit,” says Michael Everett, a graduate student in MIT’s Department of Mechanical Engineering.

“With this technique, we hope to drop a robot at the end of any driveway and have it find a door.”

MIT notes that researchers have spent recent years introducing robotic systems to natural, semantic language, training them to recognize objects by their semantic labels so that they can visually process, for example, a door as a door, as opposed to a solid, rectangular obstacle.

“Now we have an ability to give robots a sense of what things are, in real-time,” Everett says.

Everett, along with the co-authors of the paper detailing this research, Jonathan How, professor of aeronautics and astronautics at MIT, and Justin Miller of the Ford Motor Company, used similar semantic techniques as a launch point for the new navigation approach. It leverages pre-existing algorithms that extract features from visual data to generate a new map of the same scene, represented as semantic clues, or context.

In their case, the researchers built a map of the environment as the robot moved around, using an algorithm called semantic SLAM (Simultaneous Localization and Mapping) together with the semantic labels of each object and a depth image.
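To make the idea of fusing semantic labels with depth into a map concrete, here is a minimal illustrative sketch: each labeled depth reading is ray-cast into a top-down 2-D grid. The label IDs, grid size, and simplified geometry are assumptions for illustration, not the authors' semantic SLAM implementation.

```python
import numpy as np

# Hypothetical label IDs for illustration only.
UNKNOWN, SIDEWALK, DRIVEWAY, DOOR = 0, 1, 2, 3

def update_semantic_grid(grid, labels, depths, angles, robot_xy):
    """Ray-cast each labeled depth reading into a top-down semantic grid.

    labels: 1-D array of semantic IDs, one per camera ray
    depths: 1-D array of range readings (in grid cells)
    angles: 1-D array of ray angles (radians) in the world frame
    """
    rx, ry = robot_xy
    for label, d, a in zip(labels, depths, angles):
        x = int(round(rx + d * np.cos(a)))
        y = int(round(ry + d * np.sin(a)))
        if 0 <= x < grid.shape[1] and 0 <= y < grid.shape[0]:
            grid[y, x] = label  # overwrite with the newest observation
    return grid

grid = np.full((20, 20), UNKNOWN)
labels = np.array([SIDEWALK, DRIVEWAY, DOOR])
depths = np.array([4.0, 6.0, 8.0])
angles = np.array([0.0, 0.0, 0.0])  # all rays straight ahead for the demo
grid = update_semantic_grid(grid, labels, depths, angles, robot_xy=(2, 10))
```

A real semantic SLAM system would also estimate the robot's pose and handle noisy, conflicting observations; this sketch only shows how label and depth pair up into a spatial map.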

The researchers note that other semantic algorithms have enabled robots to recognize and map objects in their environment for what they are, but those algorithms haven’t allowed a robot to decide, in real time while navigating a new environment, on the most efficient path to a semantic destination such as a “front door.”

“Before, exploring was just, plop a robot down and say ‘go,’ and it will move around and eventually get there, but it will be slow,” How says.

The researchers sought to speed up a robot’s path-planning through a semantic, context-colored world, so they developed a new “cost-to-go estimator” algorithm that converts a semantic map created by preexisting SLAM algorithms into a second map, which represents the likelihood of any given location being close to the goal.
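The paper's cost-to-go estimator is a learned image-to-image network, but its input/output contract can be illustrated with a much simpler stand-in: a breadth-first distance transform that assigns every traversable cell its number of steps to the goal label. Everything here, including the function name and label scheme, is an assumption for illustration.

```python
import numpy as np
from collections import deque

def cost_to_go(semantic_map, goal_label, free_labels):
    """Return a map of steps-to-goal over traversable cells (np.inf elsewhere).

    A BFS distance transform standing in for the learned estimator:
    cells near the goal get low cost, distant cells get high cost.
    """
    h, w = semantic_map.shape
    cost = np.full((h, w), np.inf)
    q = deque()
    for y in range(h):
        for x in range(w):
            if semantic_map[y, x] == goal_label:
                cost[y, x] = 0.0
                q.append((y, x))
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and cost[ny, nx] == np.inf
                    and semantic_map[ny, nx] in free_labels):
                cost[ny, nx] = cost[y, x] + 1
                q.append((ny, nx))
    return cost

# Tiny 1x5 corridor: four walkable cells (label 1) leading to a door (label 2).
corridor = np.array([[1, 1, 1, 1, 2]])
costs = cost_to_go(corridor, goal_label=2, free_labels={1, 2})
```

The crucial difference is that a BFS needs the full map, whereas the learned estimator predicts these costs from partial observations; this sketch only shows what kind of map the network produces.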

“This was inspired by image-to-image translation, where you take a picture of a cat and make it look like a dog,” Everett says.

“The same type of idea happens here where you take one image that looks like a map of the world, and turn it into this other image that looks like the map of the world but now is colored based on how close different points of the map are to the end goal.”

The cost-to-go map is colorized in gray-scale, with darker regions representing locations far from a goal and lighter regions representing areas close to it. So the sidewalk, coded yellow in a semantic map, might be translated by the cost-to-go algorithm as a darker region in the new map, compared with a driveway, which grows progressively lighter as it approaches the front door, the lightest region in the new map.
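The gray-scale convention described above can be sketched directly: map lower cost-to-go values to brighter pixels, then let the robot greedily step toward its lightest reachable neighbor. The cost values and function names below are illustrative assumptions.

```python
import numpy as np

def to_grayscale(cost):
    """Map finite cost-to-go values to [0, 255]; lighter = closer to the goal."""
    finite = np.isfinite(cost)
    c = np.where(finite, cost, cost[finite].max())
    span = c.max() - c.min()
    if span == 0:
        return np.full(cost.shape, 255, dtype=np.uint8)
    return (255 * (c.max() - c) / span).astype(np.uint8)

def greedy_step(gray, pos):
    """Move one cell toward the lightest 4-neighbor (stay put at a maximum)."""
    y, x = pos
    best, best_val = pos, gray[y, x]
    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ny, nx = y + dy, x + dx
        if (0 <= ny < gray.shape[0] and 0 <= nx < gray.shape[1]
                and gray[ny, nx] > best_val):
            best, best_val = (ny, nx), gray[ny, nx]
    return best

# A 1x5 strip of costs: far from the goal on the left, the goal on the right.
gray = to_grayscale(np.array([[4.0, 3.0, 2.0, 1.0, 0.0]]))
```

Following `greedy_step` repeatedly from the dark end walks the robot cell by cell toward the brightest pixel, i.e. the estimated goal.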

The researchers trained the new algorithm on satellite images from Bing Maps covering 77 houses from one urban and three suburban neighborhoods. For each satellite image, semantic labels and colors were assigned to context features in a typical front yard, such as grey for a front door, blue for a driveway, and green for a hedge. According to the researchers, the system converted each semantic map into a cost-to-go map and plotted the most efficient path to the end goal by following the lighter regions in the map.

During the training process, the researchers also applied masks to each image to mimic the partial view that a robot’s camera would likely have as it makes its way through a yard.
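A minimal sketch of that masking idea, under the assumption of a simple square viewing window: keep only the part of each training map near the robot's position and blank out the rest, so the network learns to predict costs from partial views. The mask shape and parameters are illustrative, not the paper's.

```python
import numpy as np

def apply_partial_view_mask(image, center, radius, fill=0):
    """Keep a square window of the map around `center`; blank everything else."""
    masked = np.full_like(image, fill)
    y, x = center
    y0, y1 = max(0, y - radius), min(image.shape[0], y + radius + 1)
    x0, x1 = max(0, x - radius), min(image.shape[1], x + radius + 1)
    masked[y0:y1, x0:x1] = image[y0:y1, x0:x1]
    return masked

full_map = np.arange(25).reshape(5, 5)       # stand-in for a semantic map
partial = apply_partial_view_mask(full_map, center=(2, 2), radius=1)
```

Training on many such crops of the same scene forces the network to infer how unseen regions relate to the visible ones, which is the robustness How describes.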

“Part of the trick to our approach was [giving the system] lots of partial images,” How explains. “So it really had to figure out how all this stuff was interrelated. That’s part of what makes this work robustly.”

The researchers also tested their approach in simulation, on an image of an entirely new house outside the training dataset. First, they used the preexisting SLAM algorithm to generate a semantic map. Then, they applied their new cost-to-go estimator to generate a second map, and a path to a goal, which in this case was the front door.
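The two-stage pipeline just described can be sketched end to end on a toy map: start from a hand-made cost-to-go map (standing in for the learned estimator's output) and greedily descend it until the door cell is reached. The labels, map, and costs below are made up for illustration, not the paper's test house.

```python
import numpy as np

def plan_path(cost, start):
    """Greedily descend a cost-to-go map; returns the sequence of cells."""
    path = [start]
    while cost[path[-1]] > 0:
        y, x = path[-1]
        nbrs = [(y + dy, x + dx)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= y + dy < cost.shape[0] and 0 <= x + dx < cost.shape[1]]
        nxt = min(nbrs, key=lambda p: cost[p])
        if cost[nxt] >= cost[path[-1]]:
            break  # local minimum: no downhill neighbor, stop
        path.append(nxt)
    return path

# BFS distances to the door cell at (1, 2), computed by hand for this 2x3 map.
cost = np.array([[3.0, 2.0, 1.0],
                 [2.0, 1.0, 0.0]])
path = plan_path(cost, start=(0, 0))
```

A learned estimator can be wrong about unseen regions, so the real system re-estimates costs as new observations arrive rather than committing to one descent; this sketch shows only the path-extraction step.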

According to the researchers, their new cost-to-go technique found the front door 189 percent faster than classical navigation algorithms. Everett notes that the results showcase how robots can use context to efficiently locate a goal even in “unfamiliar, unmapped environments.”

“Even if a robot is delivering a package to an environment it’s never been to, there might be clues that will be the same as other places it’s seen,” Everett says. “So the world may be laid out a little differently, but there’s probably some things in common.”