Self-Driving Cars That Recognize Free Space Can Better Detect Objects

PITTSBURGH, June 16, 2020 — Researchers at Carnegie Mellon University (CMU) have demonstrated that they can significantly improve detection accuracy in self-driving cars by helping the vehicle recognize what it does not see.

Self-driving vehicles use 3D data from lidar to represent objects as a point cloud and then try to match those point clouds to a library of 3D representations of objects. The problem with that, according to Peiyun Hu, a Ph.D. student in CMU’s Robotics Institute, is that the 3D data from the vehicle’s lidar isn’t exactly 3D: the sensor can’t see the occluded parts of an object, and current algorithms don’t reason about such occlusions.

New CMU research shows that what a self-driving car doesn’t see (in green) is as important to navigation as what it actually sees (in red). Courtesy of Carnegie Mellon University.

“Perception systems need to know their unknowns,” Hu said.

Hu’s work enables a self-driving car’s perception systems to consider visibility as it reasons about what its sensors are seeing. In fact, reasoning about visibility is already used when companies build digital maps.

“Map-building fundamentally reasons about what’s empty space and what’s occupied,” said Deva Ramanan, an associate professor of robotics and director of the CMU Argo AI Center for Autonomous Vehicle Research. “But that doesn’t always occur for live, on-the-fly processing of obstacles moving at traffic speeds.”
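As a rough illustration of what reasoning about empty and occupied space can look like, the sketch below casts a ray from the sensor to each lidar return on a small 2D grid. Cells the ray passes through are marked free, the cell where it ends is marked occupied, and everything else, including the space hidden behind an obstacle, stays unknown. This is not the CMU team's implementation: the 2D grid, the function names, and the use of Bresenham's line algorithm are all assumptions made for the example, and a real system would work on a 3D voxel grid.

```python
# Illustrative sketch only; a 2D grid stands in for a 3D voxel grid.
# All names here are hypothetical and not taken from the CMU work.
UNKNOWN, FREE, OCCUPIED = 0, 1, 2


def cells_on_ray(x0, y0, x1, y1):
    """Return the grid cells from (x0, y0) to (x1, y1), inclusive,
    traced with Bresenham's line algorithm."""
    cells = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        cells.append((x, y))
        if (x, y) == (x1, y1):
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
    return cells


def visibility_grid(width, height, sensor, hits):
    """Label each cell UNKNOWN, FREE, or OCCUPIED.

    Assumes the sensor and every hit lie inside the grid.
    """
    grid = [[UNKNOWN] * width for _ in range(height)]
    for hx, hy in hits:
        ray = cells_on_ray(sensor[0], sensor[1], hx, hy)
        # Every cell before the return was seen through, so it is free,
        # unless another return already marked it as occupied.
        for x, y in ray[:-1]:
            if grid[y][x] != OCCUPIED:
                grid[y][x] = FREE
        # The ray stopped here, so something occupies this cell.
        grid[hy][hx] = OCCUPIED
    return grid


if __name__ == "__main__":
    # One return four cells to the right of a sensor at the origin.
    for row in visibility_grid(7, 3, (0, 0), [(4, 0)]):
        print(row)
```

In the small example at the bottom, the two cells beyond the return remain unknown: the sensor has no evidence about them either way, which is the kind of "unknown" a perception system would need to track.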

Hu and his colleagues’ research takes cues from map-making techniques to help the system reason about visibility when trying to recognize objects. When tested against a standard benchmark, the CMU method outperformed the previous top-performing technique, improving detection by 10.7% for cars, 5.3% for pedestrians, 7.4% for trucks, 18.4% for buses, and 16.7% for trailers.

One reason previous systems may not have taken visibility into account is a concern about computation time. But Hu and his team found that was not a problem: Their method takes just 24 ms to run. For comparison, each sweep of the lidar is 100 ms.
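The comparison between the two figures can be worked out directly. The short sketch below uses only the 24 ms and 100 ms values quoted above; the variable and function names are made up for the example, and it assumes, for simplicity, that the full sweep period is available as a processing budget.

```python
# Figures quoted in the article; everything else is illustrative.
METHOD_RUNTIME_MS = 24.0   # reported runtime of the method
SWEEP_PERIOD_MS = 100.0    # duration of one lidar sweep


def budget_fraction(runtime_ms, period_ms):
    """Fraction of one sweep period consumed by processing."""
    return runtime_ms / period_ms


fraction = budget_fraction(METHOD_RUNTIME_MS, SWEEP_PERIOD_MS)
headroom_ms = SWEEP_PERIOD_MS - METHOD_RUNTIME_MS
sweeps_per_second = 1000.0 / SWEEP_PERIOD_MS

print(f"{fraction:.0%} of each sweep, "
      f"{headroom_ms:.0f} ms headroom, "
      f"{sweeps_per_second:.0f} sweeps/s")
```

Under that assumption, the method uses roughly a quarter of the time between sweeps, leaving about 76 ms before the next sweep arrives.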

The research was presented at the Computer Vision and Pattern Recognition conference.

Published: June 2020

Glossary

machine vision: Machine vision, also known as computer vision or computer sight, refers to the technology that enables machines, typically computers, to interpret and understand visual information from the world, much like the human visual system. It involves the development and application of algorithms and systems that allow machines to acquire, process, analyze, and make decisions based on visual data. Key aspects of machine vision include: Image acquisition: Machine vision systems use various...
lidar: Lidar, short for light detection and ranging, is a remote sensing technology that uses laser light to measure distances and generate precise, three-dimensional information about the shape and characteristics of objects and surfaces. Lidar systems typically consist of a laser scanner, a GPS receiver, and an inertial measurement unit (IMU), all integrated into a single system. Here is how lidar works: Laser emission: A laser emits laser pulses, often in the form of rapid and repetitive laser...
