App Categorizes Street Scenes in Real Time

Adept at identifying roads, pedestrians, buildings and more, two smartphone apps could form the basis of an autonomous vehicle navigation system.

Developed by researchers at the University of Cambridge, the apps cannot currently control a driverless car. But the ability to help imaging devices accurately identify where they are and what they're looking at is a vital part of developing autonomous vehicles and robotics, as well as collision warning systems, the researchers said.

The first system, called SegNet, can take an image of a street it hasn't seen before and classify it, sorting objects into 12 different categories — such as roads, signs, buildings, pedestrians and cyclists — in real time.
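As a rough illustration of what sorting every pixel into one of 12 categories involves, the sketch below picks the highest-scoring class for each pixel. It assumes a network has already produced one score per class for every pixel; the function name `label_pixels`, the score layout and the class indices are hypothetical and are not taken from SegNet itself.

```python
NUM_CLASSES = 12  # the article states 12 categories; index order here is hypothetical


def label_pixels(scores):
    """Return a grid of class indices, one per pixel.

    scores[r][c] is assumed to be a list of per-class scores for the
    pixel at row r, column c. The class with the highest score wins.
    """
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


# A 1 x 2 "image" with three classes, for brevity.
tiny_scores = [[[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]]
labels = label_pixels(tiny_scores)
```

In this toy input the first pixel is assigned class 1 and the second class 0; a real system would do the same over every pixel of a full image.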

It can deal with light, shadow and nighttime environments, and labels more than 90 percent of pixels correctly, the researchers said; previous systems using expensive laser- or radar-based sensors have not been able to reach this level of accuracy while operating in real time.
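The "90 percent of pixels" figure is a per-pixel accuracy: the share of pixels whose predicted label matches a hand-labelled reference. A minimal sketch of that calculation follows, assuming both label maps are grids of the same size; the function name `pixel_accuracy` is hypothetical, and the researchers' exact evaluation procedure is not described in the article.

```python
def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted label equals the reference label.

    Both arguments are assumed to be equally sized grids (lists of rows)
    of class indices.
    """
    total = 0
    correct = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            total += 1
            if p == t:
                correct += 1
    if total == 0:
        raise ValueError("label maps are empty")
    return correct / total


# Three of the four pixels agree with the reference labels.
accuracy = pixel_accuracy([[0, 1], [2, 2]], [[0, 1], [2, 3]])
```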

Users can visit the SegNet website and upload an image or search for any city or town in the world, and the system will label all the components of the road scene. The system has been successfully tested on both city roads and motorways.

To "train" SegNet, the researchers had help from a team of undergraduates who manually labelled every pixel in each of 5,000 images. The researchers then used these labelled images to teach the system to label new scenes on its own.


In the real world, SegNet has been used primarily in highway and urban environments. It has also performed well in initial tests in rural, snowy and desert environments, the researchers said.

"It's remarkably good at recognizing things in an image because it's had so much practice," said doctoral student Alex Kendall. "However, there are a million knobs that we can turn to fine-tune the system so that it keeps getting better."

A second system designed by Kendall and professor Roberto Cipolla can determine a user's location and orientation from a single color image of a busy urban scene.

Tested on a kilometer-long stretch of King's Parade in central Cambridge, the Visual Localisation system determined location and orientation to within a few meters and a few degrees, which is more accurate than GPS, the researchers said.
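Accuracy "to within a few meters and a few degrees" can be read as two error measures: the distance between the estimated and true positions, and the smallest angle between the estimated and true headings. The sketch below is a simplified 2-D version under that assumption, with positions as (x, y) in meters and heading as a single compass angle; the function names are hypothetical and this is not the researchers' evaluation code.

```python
import math


def position_error_m(estimated_xy, true_xy):
    """Straight-line distance in meters between two (x, y) positions."""
    return math.hypot(estimated_xy[0] - true_xy[0], estimated_xy[1] - true_xy[1])


def heading_error_deg(estimated_deg, true_deg):
    """Smallest absolute angle in degrees between two headings.

    Wraps around 360 so that 359 and 2 degrees are 3 degrees apart,
    not 357.
    """
    diff = (estimated_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


pos_err = position_error_m((3.0, 4.0), (0.0, 0.0))
head_err = heading_error_deg(359.0, 2.0)
```

The wrap-around step matters because headings just either side of north would otherwise look almost a full turn apart.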

The system uses the geometry of a scene to learn its precise location, and is able to determine, for example, whether it is looking at the east or west side of a building, even if the two sides appear identical.

"In the short term, we're more likely to see this sort of system on a domestic robot — such as a robotic vacuum cleaner, for instance," said Cipolla. "It will take time before drivers can fully trust an autonomous car, but the more effective and accurate we can make these technologies, the closer we are to the widespread adoption of driverless cars and other types of autonomous robotics."

The researchers presented the two technologies at the International Conference on Computer Vision last week in Santiago, Chile.

Published: December 2015

