Image-Free Single-Pixel Detection Finds Play in Autonomous Mobility

Researchers from the Beijing Institute of Technology have developed a high-speed method to detect the location, size, and category of multiple objects without acquiring images or requiring complex scene reconstruction. The technique, which uses a single-pixel detector, greatly decreases the computing power necessary for object detection. The approach could be useful for identifying hazards while driving, as it performs classification, single-object recognition, and tracking at the same time.

According to research team leader Liheng Bian, the single-pixel object detection (SPOD) technique performs multi-object detection directly from only a small number of 2D measurements. “This type of image-free sensing technology is expected to solve the problems of heavy communication load, high computing overhead, and low perception rate of existing visual perception systems,” Bian said.

Current image-free perception methods can only achieve classification, single-object recognition, or tracking. The SPOD method achieved an object detection accuracy of just over 80%, accomplishing classification, single-object recognition, and tracking at once.


Researchers have developed a high-speed method to detect the location, size, and category of multiple objects without requiring complex scene reconstruction. The technology is expected to solve the problems of heavy communication load, high computing overhead, and low perception rate of existing visual perception systems, according to research team leader Liheng Bian from the Beijing Institute of Technology. Courtesy of Lintao Peng/Beijing Institute of Technology.

Once 2D measurements are obtained using the researchers' method, the measurements are fed into a transformer-based encoder — a type of deep learning model — and the high-dimensional, most relevant features in the scene are extracted. These features are fed into a multiscale attention network-based decoder, which outputs the class, location, and size information of all the targets in the scene simultaneously.

Automating advanced visual tasks typically requires detailed images of a scene to extract the features necessary to identify an object. However, this requires either complex imaging hardware or complicated reconstruction algorithms, which leads to high computational cost, long running time, and heavy data transmission load. For this reason, the traditional image-first, perceive-later approaches may not be best for object detection.


Image-free sensing methods based on single-pixel detectors can cut down on the computational power needed for object detection. Instead of employing a pixelated detector such as a CMOS or CCD, single-pixel imaging illuminates the scene with a sequence of structured light patterns and then records the transmitted light intensity to acquire the spatial information of objects. This information is then used to computationally reconstruct the object or to calculate its properties.
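The measurement step described above can be sketched in a few lines. This is a minimal illustration, not the researchers' implementation: the random binary patterns, the 8 x 8 scene, and the function names are all hypothetical, and the sampling rate is assumed to mean the ratio of patterns to scene pixels. Each pattern yields one intensity value, modeled here as the sum of the pattern multiplied element-wise by the scene.

```python
import random


def make_patterns(num_patterns, height, width, seed=0):
    """Generate random binary illumination patterns (illustrative stand-in;
    the article describes optimized patterns, which are not reproduced here)."""
    rng = random.Random(seed)
    return [
        [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
        for _ in range(num_patterns)
    ]


def single_pixel_measure(scene, patterns):
    """Return one intensity per pattern: the sum of pattern * scene over all pixels."""
    return [
        sum(
            p * s
            for p_row, s_row in zip(pattern, scene)
            for p, s in zip(p_row, s_row)
        )
        for pattern in patterns
    ]


# Hypothetical 8 x 8 scene containing one bright 3 x 3 object.
HEIGHT, WIDTH = 8, 8
scene = [[0.0] * WIDTH for _ in range(HEIGHT)]
for r in range(2, 5):
    for c in range(3, 6):
        scene[r][c] = 1.0

# Assumption: sampling rate = number of patterns / number of scene pixels.
SAMPLING_RATE = 0.05
num_patterns = max(1, round(SAMPLING_RATE * HEIGHT * WIDTH))

patterns = make_patterns(num_patterns, HEIGHT, WIDTH)
measurements = single_pixel_measure(scene, patterns)
print(num_patterns, measurements)
```

In an image-free approach, a short vector like `measurements` would go straight to the detection model rather than being used to reconstruct the scene first.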

According to the researchers, the small, optimized pattern sampling used by SPOD achieves high image-free sensing accuracy with about one order of magnitude fewer pattern parameters than conventional full-size pattern sampling.

“Compared to the full-size pattern used by other single-pixel detection methods, the small, optimized pattern produces better image-free sensing performance,” researcher Lintao Peng said.

Further, Peng said, “The multiscale attention network in the SPOD decoder reinforces the network’s attention to the target area in the scene. This allows more efficient extraction of scene features, enabling state-of-the-art object detection performance.”

“For autonomous driving, SPOD could be used with lidar to help improve scene reconstruction speed and object detection accuracy,” Bian said. “We believe that it has a high enough detection rate and accuracy for autonomous driving while also reducing the transmission bandwidth and computing resource requirements needed for object detection.”

To experimentally demonstrate SPOD, the researchers built a proof-of-concept setup. Images randomly selected from the Pascal VOC 2012 test data set were printed on film and used as target scenes. At a sampling rate of 5%, the average time to complete spatial light modulation and image-free object detection per scene with SPOD was just 0.016 s. This is a significant boost over methods performing scene reconstruction first (0.05 s) and then object detection (0.018 s). SPOD showed an average detection accuracy of 82.2% with a refresh rate of 63 frames per second for all the object classes included in the test data set.
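The timing figures above can be checked with simple arithmetic. The sketch below uses only the per-scene times quoted in the article. It assumes that the reconstruct-first approach runs its two stages one after the other, so that their times add. The variable names are illustrative.

```python
# Per-scene times quoted in the article, in seconds.
SPOD_TIME_S = 0.016            # spatial light modulation + image-free detection
RECONSTRUCTION_TIME_S = 0.05   # scene reconstruction (reconstruct-first approach)
DETECTION_TIME_S = 0.018       # object detection on the reconstructed image

# Assumption: the two reconstruct-first stages are sequential, so times add.
pipeline_time_s = RECONSTRUCTION_TIME_S + DETECTION_TIME_S

spod_fps = 1.0 / SPOD_TIME_S
pipeline_fps = 1.0 / pipeline_time_s
speedup = pipeline_time_s / SPOD_TIME_S

print(f"SPOD: {spod_fps:.1f} scenes per second")
print(f"Reconstruct-first: {pipeline_fps:.1f} scenes per second")
print(f"Speedup: {speedup:.2f}x")
```

Under that assumption, 0.016 s per scene corresponds to about 62.5 scenes per second, consistent with the reported refresh rate of roughly 63 frames per second, and SPOD comes out a little more than four times faster than the reconstruct-first pipeline.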

“Currently, SPOD cannot detect every possible object category because the existing object detection data set used to train the model only contains 80 categories,” Peng said. “However, when faced with a specific task, the pre-trained model can be fine-tuned to achieve image-free multi-object detection of new target classes for applications such as pedestrian, vehicle, or boat detection.”

The researchers plan to extend the image-free perception technology to other kinds of detectors and computational acquisition systems to achieve reconstruction-free sensing technology.

The research was published in Optics Letters (www.doi.org/10.1364/OL.486078).

Published: May 2023

