Cameras Can Record Object Density More Accurately with New Crowd-Counting Approach

Crowd counting, the process of estimating the density or number of objects such as vehicles or people in an image, can benefit from the same deep learning techniques that have been used for image and video processing. Scientists at Japan Advanced Institute of Science and Technology (JAIST), in collaboration with researchers at Sirindhorn International Institute of Technology (SIIT) in Thailand, developed a way to achieve higher performance in crowd counting by using a backward connection in a deep neural network (DNN).

The estimation network proposed by the researchers consists of two identical networks for extracting a high-level feature and estimating the final result. To preserve semantic information, dilated convolution is used without resizing the feature map.

Instead of using a normal skip connection or a forward connection, the researchers use a backward connection that feeds feature maps from a deeper layer back to a shallower layer. This helps the shallow layer recognize the characteristics of the target in advance, so that false positives are reduced before the density map is formed. Objects with small and large scales are correlated to shallow and deep layers, respectively.
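The idea of a backward connection can be illustrated with a toy one-dimensional sketch. It is not the researchers' architecture: the box filter standing in for a deeper layer, the hard threshold gate, and all function names and parameter values here are assumptions chosen only to show how wide-context information fed back to a shallow layer can suppress an isolated false positive.

```python
def box_filter(values, radius):
    """Average each position with its neighbours (zero-padded), keeping the length."""
    n = len(values)
    out = []
    for i in range(n):
        window = [values[j] for j in range(i - radius, i + radius + 1) if 0 <= j < n]
        out.append(sum(window) / (2 * radius + 1))
    return out


def refine_with_backward_connection(shallow, radius=2, threshold=0.3):
    """Gate shallow responses with context computed by a (stand-in) deeper layer."""
    # Assumption: a wide average plays the role of a deep layer with a large receptive field.
    deep = box_filter(shallow, radius)
    # Backward connection: the deep response is sent back as a gate on the shallow layer.
    gate = [1.0 if d >= threshold else 0.0 for d in deep]
    return [s * g for s, g in zip(shallow, gate)]


# One isolated spike (index 1) and one dense cluster (indices 6 to 8).
shallow = [0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.8, 0.9, 0.7, 0.0]
refined = refine_with_backward_connection(shallow)
print(refined)
```

In this sketch the isolated spike is zeroed because its surroundings give a weak deep response, while the cluster passes through unchanged. A trained network would learn such interactions rather than use a fixed threshold.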

An example of a density map obtained from an image in the TRANCOS data set. Courtesy of JAIST.


To ensure the quality of a density map, feature maps in every layer should have the same resolution, the researchers said. In the team’s approach, dilated convolution is used in the skip network to enlarge the receptive field while retaining the high-level feature information needed when feature maps are integrated through the skip connection. As dilated convolution layers are stacked, the receptive field can grow exponentially while the resolution of the feature map is preserved.
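The relationship between dilation, resolution, and receptive field can be sketched in a few lines. This is a minimal one-dimensional illustration, not the network from the paper: the function names, the odd-kernel restriction, and the doubling dilation schedule are assumptions made for the example.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded 1-D dilated convolution whose output length equals the input length."""
    k = len(kernel)
    assert k % 2 == 1, "this sketch assumes an odd kernel size"
    pad = dilation * (k - 1) // 2  # padding that keeps the resolution unchanged
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


impulse = [0, 0, 0, 1, 0, 0, 0]
response = dilated_conv1d(impulse, [1, 1, 1], dilation=2)
print(response)                            # taps land two positions apart
print(receptive_field(3, [1, 1, 1]))       # 7: linear growth without dilation
print(receptive_field(3, [1, 2, 4]))       # 15: growth when the dilation doubles per layer
```

With a kernel size of 3 and dilation rates that double at each layer, n stacked layers cover 2^(n+1) - 1 input positions, which is the exponential growth referred to above, while every layer's output keeps the input length.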

The researchers tested their method on three data sets for counting humans and vehicles in crowd images. They evaluated counting performance using mean absolute error and root mean squared error, which indicate the accuracy and robustness of the network, respectively. The experimental results showed that the network outperformed other related networks at high crowd densities and could be effective for reducing overcounting errors.
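The two metrics can be written out as a short sketch. It assumes the convention common in density-map counting that the predicted count for an image is the sum of its density map. The function names and the sample counts below are made up for illustration and are not results from the paper.

```python
import math


def count_from_density_map(density_map):
    """Estimated object count: the sum of all values in a 2-D density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between predicted and true counts."""
    errors = [abs(p - t) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)


def root_mean_squared_error(true_counts, predicted_counts):
    """Square root of the average squared count error; penalizes large misses more."""
    squared = [(p - t) ** 2 for t, p in zip(true_counts, predicted_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per-image counts, for illustration only.
true_counts = [10, 20, 30, 40]
predicted_counts = [12, 18, 30, 48]

mae = mean_absolute_error(true_counts, predicted_counts)
rmse = root_mean_squared_error(true_counts, predicted_counts)
count = count_from_density_map([[0.0, 0.5], [0.5, 1.0]])
print(mae, rmse, count)
```

Here the single large miss (48 against 40) raises the root mean squared error well above the mean absolute error, which is why the former is read as a robustness indicator.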

Crowd counting is a challenging task because of variations in object scale and crowd density. Existing approaches emphasize skip connections that integrate shallower layers with deeper layers, where each layer extracts features at a different object scale and crowd density. In these approaches only high-level features are emphasized, while low-level features are ignored, the researchers said. The new DNN with a backward connection could achieve a more accurate estimation of the density of objects, and could be applied to estimate human density in public spaces or vehicle density on a road in order to improve public safety, security, and traffic efficiency.

“Backward connection in DNN enables [us] to take advantages obtained from both high-level feature and low-level feature in an image, and therefore achieves higher performance than before,” said professor Atsuo Yoshitaka, head of the Yoshitaka lab. The lab is currently developing different kinds of DNNs for industrial applications such as object detection and identification in micrographs and defect detection for industrial products.

The research was published in the Journal of Imaging (www.doi.org/10.3390/jimaging6050028).   

Vision-Spectra.com
Jul 2020
GLOSSARY
machine vision
Interpretation of an image of an object or scene through the use of optical noncontact sensing mechanisms for the purpose of obtaining information and/or controlling machines or processes.
Research & Technology, education, Asia-Pacific, Japan Advanced Institute of Science and Technology, imaging, neural networks, machine vision, Sensors & Detectors, smart cameras, crowd counting, object identification, industrial, deep neural networks
