Software Reinvents Machine Vision

MOJTABA HOSSEINI, PLEORA TECHNOLOGIES INC.

When technology entrepreneur Marc Andreessen famously wrote “software is eating the world,” he signaled this technology’s increasing power to disrupt existing companies and entire industries. While Andreessen cited numerous examples where software revolutionized the retail, consumer electronics and telecom sectors, today a similar reinvention is playing out in the machine vision market.

Thanks to advances in software and the widespread deployment of vision systems in industrial settings, designers can now harness data from countless users to build self-learning imaging systems. Software’s power to transform business models — backed by a global army of cameras and sensors capturing optical data — could also leave some traditional machine vision companies struggling to find the right path forward (Figure 1).

Figure 1. Convolutional neural nets have found uses in many applications, such as face detection, image classification, natural language processing and autonomous driving. Courtesy of Pleora Technologies Inc.

Machine learning in the cloud

By building its business on software, Uber has revolutionized its industry without owning the expensive infrastructure required by traditional taxi operators. New players in the machine vision industry are following a similar path, with software-based services that bypass the expense and limitations of hardware.

Through the application programming interfaces of cloud platforms such as Amazon Web Services or Microsoft Azure, users can tap globally distributed, cost-effective and almost unlimited computing services without owning any infrastructure. For the machine vision industry, inexpensive cloud computing means that algorithms once too computationally expensive, because they required dedicated infrastructure, are now affordable. These cost advantages have in part helped speed the emergence of deep convolutional neural networks (CNNs) as a powerful and flexible alternative to hard-wired algorithms for solving difficult machine vision problems.

In machine vision applications such as object recognition, detection and classification, the “learning” portion of the process that once required vast computing resources can now happen in the cloud rather than on dedicated, owned and expensive infrastructure. The processing power required for the imaging system to accurately and repeatedly simulate human understanding, learn new processes, and identify and even correct flaws is now within reach for any developer.

Access to cloud computing software also allows for greater design flexibility when choosing sensors and optics for inspection systems. With “for hire” computing resources eliminating the need to own processing hardware, and access to powerful new algorithms such as CNNs, designers can employ higher-resolution sensors, a very large number of lower-resolution sensors, or sensors that generate greater amounts of data, such as 3D sensors.

The new machine vision business model

The newfound access to unbridled computing power is already ushering in a new machine-vision-inspection-as-a-service business model. For example, manufacturers can rely on inexpensive sensors or existing legacy imaging equipment that feed data to a third-party cloud-based system that provides comprehensive real-time inspection and analysis. Using a deep learning approach involving CNNs, data from a global source of events can help quickly train the computer model to identify objects, defects and flaws.

Machine vision systems typically rely on a complex arrangement of lighting, cameras and sensors, and image triggering for inspection applications. With cloud-based machine vision as a service, manufacturers can use a less controlled environment that provides “good enough” images for centralized inspection. This helps drive down system costs and potentially leads to a new business model where manufacturers pay per units inspected or number of detected defects. With this approach, smaller manufacturers could take advantage of sophisticated analysis without requiring expensive inspection systems. Thanks to the cloud, rather than “owning” the object recognition capabilities, end users can access data shared across millions of systems and devices to help support continual process and production improvements (Figure 2).

Figure 2. With cloud-based image analysis, manufacturers can access insight from a global base of users to improve processes. Courtesy of Pleora Technologies Inc.

Similarly, software is enabling the gamification of image analysis and object recognition with apps that reach a global base of users who tag, identify and highlight defects. This data then supports the learning process for the machine vision algorithm. Gamers earn points by identifying defects in bottles in thousands of line scan images, or identifying patterns in human cell samples, while researchers gain the fast, inexpensive data tagging and identification required for supervised machine learning approaches to image-based defect detection. Each image is reviewed by multiple users, with software analyzing results and aggregating data to identify gamers with the best ability to provide fast and accurate analysis while excluding outliers. Thousands of users can be working around the clock to tag images of defective products, such as choosing an image of a bottle with a crack or drawing a line along a crack. The data from the global base of gamers feeds the ever-improving learning for the CNN-based inspection algorithm.
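The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under assumed conventions, not the method of any particular tagging app: the function name `aggregate_labels`, the `(image_id, user_id, label)` vote format, and the 60% agreement threshold are all hypothetical choices.

```python
from collections import Counter, defaultdict

def aggregate_labels(votes, min_agreement=0.6):
    """Majority-vote aggregation of crowd labels (hypothetical sketch).

    votes: list of (image_id, user_id, label) tuples.
    Returns (consensus, accuracy): the agreed label per image, and each
    user's rate of agreement with the consensus.
    """
    by_image = defaultdict(list)
    for image_id, user_id, label in votes:
        by_image[image_id].append(label)

    consensus = {}
    for image_id, labels in by_image.items():
        label, count = Counter(labels).most_common(1)[0]
        # Keep a label only when enough reviewers agree on it.
        if count / len(labels) >= min_agreement:
            consensus[image_id] = label

    # Score each user against the consensus to find reliable taggers.
    hits, seen = defaultdict(int), defaultdict(int)
    for image_id, user_id, label in votes:
        if image_id in consensus:
            seen[user_id] += 1
            hits[user_id] += int(label == consensus[image_id])
    accuracy = {user: hits[user] / seen[user] for user in seen}
    return consensus, accuracy

# Made-up votes from three hypothetical users.
votes = [
    ("img1", "ann", "crack"), ("img1", "bob", "crack"), ("img1", "cy", "ok"),
    ("img2", "ann", "ok"), ("img2", "bob", "ok"), ("img2", "cy", "ok"),
    ("img3", "ann", "crack"), ("img3", "bob", "ok"),
]
consensus, accuracy = aggregate_labels(votes)
print(consensus)
print(accuracy)
```

Images without sufficient agreement (here, `img3`) are left out of the training set rather than guessed, and users who often disagree with the consensus can be down-weighted or excluded as outliers.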

The growing emphasis on data will also impact vision system design. Cameras and sensors will still play an important role in collecting the vast amounts of data required to support cloud-based analysis. The need to transport large amounts of sensitive, high-bandwidth imaging data to the cloud will drive development in mathematically lossless compression, encryption and security, and higher bandwidth wireless video interfaces. Cloud-based image analysis also eliminates the need for nearby hardware for processing and analysis, paving the way for more portable, compact imaging systems.
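As a small illustration of what “mathematically lossless” means in practice, the sketch below compresses a synthetic frame and checks that every byte is recovered. It uses Python's general-purpose `zlib` module purely as a stand-in; the article does not name a codec, and the 64 x 64 gradient frame and helper names are assumptions made for the example.

```python
import zlib

def compress_frame(pixels: bytes) -> bytes:
    """Losslessly compress raw pixel bytes (zlib used as a stand-in codec)."""
    return zlib.compress(pixels, 6)

def decompress_frame(blob: bytes) -> bytes:
    """Recover the original pixel bytes exactly."""
    return zlib.decompress(blob)

# Synthetic 64 x 64, 8-bit frame: a repeating horizontal gradient.
width, height = 64, 64
frame = bytes((x * 4) % 256 for _ in range(height) for x in range(width))

blob = compress_frame(frame)
restored = decompress_frame(blob)

# Lossless means the round trip is bit-exact, not merely visually similar.
print(restored == frame)
print(len(frame), "bytes raw,", len(blob), "bytes compressed")
```

Real sensor data is noisier than this gradient and compresses far less, which is one reason lossless ratios for imaging data remain an active design concern.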

As processing hardware becomes either a commodity or an infrastructure expense to be avoided, data ownership and analytics services grow in importance. It’s no longer enough to sell a good algorithm; the real value is a large data set that verifies the domain-specific algorithm and supports a migration toward self-learning imaging systems.

Other industries are quickly heading in this direction. For example, Facebook is free for end users, but the detailed insight and analytics collected by the platform are highly valued by marketers. Similarly, in machine vision, owning a large data set specific to a domain may be of more value than owning a specific algorithm or the know-how to detect defects.

Open source, free or inexpensive software, plus a large group of users from different markets sharing code and best practices, provides machine vision system designers with cost-effective access to a wide range of powerful development tools. The tools are typically generic and can be applied to a large variety of applications and domains. CNNs, for example, have found uses in applications as different as face detection, image classification, natural language processing and autonomous driving, yet the underlying approach is the same and the tools are shared.

Lower barriers to entry

Similar to examples from other markets, in machine vision this wider access to open source tools has lowered the barrier to entry for new players while disrupting incumbents that have made significant investments in developing in-house tools. For example, free open source packages that provide access to powerful software libraries for vision-based algorithms (such as Google’s neural network tool TensorFlow, Berkeley’s Caffe for CNNs, or OpenCV) remove the requirement to license expensive proprietary tools.

As an added benefit, with a wider base of contributors from different domains and markets, these open source toolkits and algorithm implementations often advance more quickly than an in-house or licensed approach. Similar to cloud-based inspection, new business models supporting machine vision inspection software as a service may soon displace sales of specific algorithms or toolkits.

Object detection software is also playing a key role in bringing machine vision to the masses. Until recently, machine vision was primarily confined to large factory floors. With reduced infrastructure expenses, small-scale manufacturers can more easily take advantage of advanced inspection processes for limited production runs. Imaging expertise is also moving into almost every market. In today’s state-of-the-art operating room, surgeons use a real-time network of cameras, sensors and displays to precisely navigate robotic surgical tools that minimize damage to healthy tissue, improve results and speed recovery. On the battlefield, ruggedized imaging systems improve intelligence and surveillance while keeping troops out of harm’s way.

Machine vision object detection is even migrating into our daily lives as consumers. Smartphone cameras have face- and smile-detection capabilities that rival the performance of the most sophisticated vision systems from 10 years ago. The high-resolution, small form factor, lower-cost sensors and embedded processing in smartphones will influence the design of smaller, smarter cameras used for inspection on the factory floor. Smartphones are also increasingly becoming part of the inspection system, primarily as an endpoint for machine vision video. In these systems, the high-quality real-time vision standard video required for processing and analysis can be converted into the H.264 compression format and streamed to a mobile device for remote viewing.

As machine vision object detection becomes mainstream technology on the factory floor and moves into new markets, ease of use becomes a key factor. This new type of consumer for imaging systems requires a more elegant, straightforward user interface, in line with what they’re used to on the web, smartphone or tablet, than does the highly specialized, trained vision system expert on a factory floor. For a smaller manufacturer requiring a basic inspection application, designers can expose a user interface with intuitive, nontechnical and commonly used functions while hiding the complexity and wide range of options provided to expert users.

Impact on design

These advancements in cloud-based service, open source tools and sophisticated yet easier-to-use imaging systems can have a significant impact on machine vision inspection system design.

In particular, as software-based CNNs gain in importance for many recognition and analysis applications, there is increasing need for hardware-accelerated CNNs in graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) to more efficiently perform some functions. Again, machine vision algorithms and processes that were too computationally expensive not long ago benefit from the widespread availability of cost-effective GPU, FPGA and digital signal processor (DSP) cores alongside traditional central processing units (CPUs).
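To give a sense of why hardware acceleration matters, the sketch below implements the sliding-window operation at the core of a CNN layer in plain Python and counts its multiply-accumulate (MAC) operations. It is a teaching sketch with assumed names (`conv2d`, `mac_count`) and a hand-picked edge-detection kernel, not a representation of how any production framework implements convolution.

```python
def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the operation CNN layers call convolution."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0.0] * ow for _ in range(oh)]
    for y in range(oh):
        for x in range(ow):
            acc = 0.0
            # One multiply-accumulate per kernel element, per output pixel.
            for j in range(kh):
                for i in range(kw):
                    acc += image[y + j][x + i] * kernel[j][i]
            out[y][x] = acc
    return out

def mac_count(ih, iw, kh, kw):
    """Multiply-accumulates needed for one kernel over one single-channel image."""
    return (ih - kh + 1) * (iw - kw + 1) * kh * kw

# Tiny 4 x 4 image with a vertical edge, and a 3 x 3 vertical-edge kernel.
image = [[0, 0, 1, 1] for _ in range(4)]
kernel = [[-1, 0, 1] for _ in range(3)]

edges = conv2d(image, kernel)
print(edges)
print(mac_count(1080, 1920, 3, 3), "MACs for one 3 x 3 kernel on a 1080p frame")
```

Every output pixel is independent of the others, so the inner loops map naturally onto the parallel arithmetic units of GPUs and FPGAs. A real network applies many such kernels across many channels and layers, multiplying the count accordingly.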

The traditional architecture of machine vision applications is also changing, thanks in part to the evolution toward analysis software running on embedded devices and advancements in cloud-based analysis. Previously, vision system designers had to transport high-bandwidth, low-latency video from imaging sources to a large PC for processing and analysis. Today, increasingly sophisticated processing can happen very close to the image source or at the sensor itself, in the form of a smart camera or an embedded device.

The processor is the “brains” of an embedded system that performs a dedicated function within a larger mechanical or industrial setting. Where a typical PC is designed with the flexibility to support a range of functions for end users, an embedded system is dedicated to one particular task, often with little or no user interface. In robotics, most tasks or processes that are automated and repeated, including object detection, are good candidates to be handled by an embedded processor.

Embedded systems are now available that offer very high processing capabilities in an extremely small form factor. This means processing intelligence for the vision system can be located at different points in the network: in a roadside cabinet, up a gantry, or within a camera. As part of a machine vision system, smaller form-factor embedded computing platforms deliver considerable weight and footprint advantages, while power efficiencies help lower operating costs and reduce heat output to prevent the premature failure of other electronic components. The end result is increased system design flexibility, an upward shift in intelligence at various points in the network, increased performance, and lower costs (Figure 3).

Figure 3. The combination of cloud-based analysis and embedded systems offering very high processing capabilities in an extremely small form factor delivers considerable weight, footprint and power advantages for robotic vision systems. Courtesy of Pleora Technologies Inc.

As part of the design process, system integrators need to ensure that the software development kit (SDK) they are using to interface to the imaging source can run on the embedded processor. An embedded system based on an ARM processor, for example, does not have a vision-specific interface but it does support Ethernet. Another significant design consideration is ensuring the image processing algorithm can run on the embedded processor. Libraries, such as OpenCV, are available for both ARM and PowerPC.

Standards and software also go hand-in-hand to help lower costs and simplify design of increasingly complex imaging applications by enabling a “Lego block” design approach. As an example, the GenICam standard provides a straightforward approach to configure a camera or imaging source, while the GigE Vision and USB3 Vision standards have powered the development of off-the-shelf video interfaces that transport high-bandwidth, low-latency video over an Ethernet or USB 3.0 cable (see table). In tandem, these standards allow designers to quickly develop and integrate imaging systems without needing to develop or configure all of the components involved.
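When choosing between these interfaces, a quick back-of-the-envelope calculation of raw video bandwidth is often the first step. The sketch below is illustrative only: the 1920 x 1080, 8-bit, 30 fps camera and the 90% usable-link fraction are assumptions, and the link figures are the nominal signaling rates of Gigabit Ethernet (1000 Mb/s) and USB 3.0 (5000 Mb/s), not measured throughput.

```python
def video_bandwidth_mbps(width, height, bits_per_pixel, fps):
    """Raw (uncompressed) video payload rate in megabits per second."""
    return width * height * bits_per_pixel * fps / 1e6

def fits_link(required_mbps, link_mbps, usable_fraction=0.9):
    """True if the stream fits within an assumed usable share of the link."""
    return required_mbps <= link_mbps * usable_fraction

GIGE_MBPS = 1000   # nominal Gigabit Ethernet rate
USB3_MBPS = 5000   # nominal USB 3.0 signaling rate

# Hypothetical camera: 1920 x 1080 pixels, 8 bits per pixel.
rate_30 = video_bandwidth_mbps(1920, 1080, 8, 30)
rate_60 = video_bandwidth_mbps(1920, 1080, 8, 60)

print(round(rate_30, 3), "Mb/s at 30 fps, fits GigE:", fits_link(rate_30, GIGE_MBPS))
print(round(rate_60, 3), "Mb/s at 60 fps, fits GigE:", fits_link(rate_60, GIGE_MBPS))
print("60 fps fits USB 3.0:", fits_link(rate_60, USB3_MBPS))
```

Under these assumptions, doubling the frame rate pushes the same camera past what a single Gigabit Ethernet link can carry, which is the kind of result that steers a designer toward a faster interface, multiple links, or compression.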

The basics of the GenICam, GigE Vision and USB3 Vision standards

Engineering approaches such as continuous delivery (CD) are also having an impact on vision system design. Instead of slower development cycles and fully finished products, a CD approach, coupled with an Agile development philosophy, supports the design of software-based prototypes that disrupt the market with “good enough” solutions. This affords designers the flexibility to improve quality or meet market demands, take advantage of hardware acceleration, or potentially develop a lower-cost solution once they fully capture a market. Coupled with a cloud-based approach to learning from new data sets, software algorithms can be continually updated and improved to drive efficiency higher at every iteration.

Meeting the challenge

Object detection is transforming from a niche machine vision technology serving a select market to a ubiquitous solution serving an ever-expanding set of applications, extending from the factory floor to sterile hospital operating rooms and dusty battlefields. Image-based inspection systems that previously required very large capital expenditures and application-specific, domain-specific tools and algorithms are now available to smaller manufacturers. With open source and free software tools, these manufacturers can access increasingly sophisticated inspection and analysis to help streamline operations, introduce efficiencies, and reach market faster.

For designers, understanding the potential for software and its impact on system design and usability will be key as they focus on delivering vision solutions that support faster, more complex detection and analysis in an easy-to-use experience. While the pace of innovation may overwhelm some market incumbents, there are growing opportunities for new entrants focusing on leveraging the cost and scalability advantages of software to help drive new process and cost improvements for end users.

Meet the author

Mojtaba Hosseini is manager of product architecture and technology with Pleora Technologies Inc. in Ottawa, Ontario, Canada; email: [email protected].
