Cameras Record Object Density More Accurately

Crowd counting, the process of obtaining information on the density or number of objects such as vehicles or people, can benefit from the same deep learning techniques that have been used for image and video processing. Scientists at the Japan Advanced Institute of Science and Technology (JAIST), in collaboration with researchers at the Sirindhorn International Institute of Technology (SIIT) in Thailand, developed a way to achieve higher performance in crowd counting by using a backward connection in a deep neural network (DNN).

The estimation network proposed by the researchers consists of two identical networks for extracting high-level features and estimating the final result. To preserve semantic information, dilated convolution is used so that the feature map does not have to be resized.
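
To illustrate how a dilated convolution can widen its view of the input without resizing the output, the following is a minimal one-dimensional sketch in plain Python. It is not the researchers' network: the function name `dilated_conv1d`, the three-tap kernel, and the dilation rates are assumptions chosen only for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded 1D dilated convolution (cross-correlation form).

    Hypothetical helper for illustration; the output has the same
    length as the input for any dilation rate.
    """
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("kernel length must be odd")
    # Padding grows with the dilation so the output length is unchanged.
    pad = dilation * (k - 1) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j in range(k):
            # Kernel taps are spaced `dilation` samples apart.
            acc += kernel[j] * padded[i + j * dilation]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
kernel = [1.0, 1.0, 1.0]
for d in (1, 2, 4):
    result = dilated_conv1d(signal, kernel, dilation=d)
    print(d, len(result), result)
```

In this sketch each output sample sums three input samples spaced `dilation` positions apart, so a larger dilation covers a wider span of the input while the output keeps all eight samples.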

Instead of using a normal skip connection or a forward connection, the researchers use a backward connection that passes feature maps from a deeper layer back to a shallower layer. This helps the shallow layer to recognize the characteristics of the target in advance, so that false positives are reduced before the density map is formed. Small-scale and large-scale objects are associated with shallow and deep layers, respectively.


An example of a density map obtained from an image in the TRANCOS data set. Courtesy of JAIST.

To ensure the quality of a density map, feature maps in every layer should have the same resolution, the researchers said. In the team’s approach, dilated convolution is used in the skip network to increase the receptive field size while keeping the information of high-level features for integrating feature maps in the skip connection. With stacked dilated convolution layers, the receptive field can grow exponentially while the resolution of the feature map is preserved.
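
The receptive field growth can be checked with a short calculation. The sketch below assumes stride-1 layers with a 3 × 3 kernel and dilation rates that double from layer to layer; these values are illustrative assumptions, not the configuration reported by the researchers, and `receptive_field` is a hypothetical helper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in input pixels along one axis, of stacked
    stride-1 convolutions with the given per-layer dilation rates."""
    rf = 1
    for d in dilations:
        # Each layer extends the field by (kernel_size - 1) * dilation.
        rf += (kernel_size - 1) * d
    return rf


plain = [1, 1, 1, 1]     # four ordinary 3x3 layers (assumed)
dilated = [1, 2, 4, 8]   # four layers with doubling dilation (assumed)

print(receptive_field(3, plain))    # 9
print(receptive_field(3, dilated))  # 31
```

Under these assumptions, four ordinary layers see 9 pixels per axis, while four layers with doubling dilation see 31, and neither stack changes the feature map resolution as long as the stride stays at 1 and the input is padded accordingly.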

The researchers tested their method on three data sets for counting humans and vehicles in crowd images. They evaluated counting performance using mean absolute error and root mean squared error to indicate the accuracy and robustness of the network, respectively. The experimental results showed that the network outperformed other related networks at high crowd densities and could be effective for reducing overcounting errors.
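
For readers unfamiliar with the two metrics, the sketch below computes them over per-image counts using their standard definitions. The function names and the sample counts are made up for illustration and are not results from the study.

```python
import math


def mean_absolute_error(true_counts, predicted_counts):
    """Average absolute difference between true and predicted counts."""
    errors = [abs(t - p) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)


def root_mean_squared_error(true_counts, predicted_counts):
    """Square root of the average squared count difference;
    penalizes large per-image errors more heavily than MAE."""
    squared = [(t - p) ** 2 for t, p in zip(true_counts, predicted_counts)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ground-truth and estimated counts for three images.
true_counts = [10, 50, 200]
predicted_counts = [12, 45, 230]

mae = mean_absolute_error(true_counts, predicted_counts)
rmse = root_mean_squared_error(true_counts, predicted_counts)
print(f"MAE:  {mae:.2f}")
print(f"RMSE: {rmse:.2f}")
```

In this made-up example the single large miss on the densest image raises the root mean squared error well above the mean absolute error, which is why the former is commonly read as a robustness indicator.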

Crowd counting is a challenging task because of variations in object scale and crowd density. Existing approaches emphasize skip connections that integrate shallower layers with deeper layers, where each layer extracts features at a different object scale and crowd density. In these approaches, only high-level features are emphasized, while low-level features are ignored, the researchers said. The new DNN with a backward connection could achieve a more accurate estimation of the density of objects, and could be applied to estimating human density in public spaces or vehicle density on a road in order to improve public safety, security, and traffic efficiency.

“Backward connection in DNN enables [us] to take advantages obtained from both high-level feature and low-level feature in an image, and therefore achieves higher performance than before,” professor Atsuo Yoshitaka, head of the Yoshitaka lab, said. The Yoshitaka lab is currently developing different kinds of DNNs for industrial applications such as object detection and identification in micrographs and defect detection for industrial products.

The research was published in the Journal of Imaging (www.doi.org/10.3390/jimaging6050028).   


©2024 Photonics Media