Machine vision used for entertainment applications shows where the technology is going.
Hank Hogan, Contributing Editor
Machine vision applications used to be all work and no play. That is no longer the case. Researchers and vendors now are setting their sights on games, rides and other entertainment, including automated foosball, augmented reality and tables that interact with audiences to play music. There also are vision systems that could replace the physical barriers that control crowds at amusement parks or that could make rides safer. A look at the industry reveals the technology behind the fun and games and gives clues about where things are headed.
Vladimir Tucakov, a director at Vancouver-based camera maker Point Grey Research, said that the push into entertainment is not all that surprising. After all, he noted, the vision market often has been divided into industrial and nonindustrial segments. The latter, which is where interactive games, music videos, rides and other amusements fall, is a big part of the overall pie.
“I think a lot of the vision suppliers have been in that area for a long time,” Tucakov said about entertainment. “Computing is getting faster and the cameras are getting cheaper, so it is opening up even more new applications.”
The entertainment environment often is uncontrolled, with people free to do the unexpected and the unplanned. That places a burden on the vision system, made all the worse because that variability also can extend to lighting.
On the other hand, entertainment typically doesn’t make the same demands for accuracy as does an industrial application. For example, determining the precise location of someone’s waving hand is not critical, which makes the inherent variability easier to handle. What is vital is that the overall experience be entertaining, an outcome that is likely to be a product of the entire system and not just a consequence of the vision subsystem.
Another factor that makes entertainment different is that artists often like distortions or inaccuracies. An artist might, for example, create an effect by exploiting a rolling shutter, which typically is avoided in an industrial setting because it reads out the image row by row and so renders moving objects incorrectly. Such distortions can have entertainment value, in which case a faithful rendering of a scene actually could be a drawback.
When reality isn’t enough
As an example of what can be done in the entertainment arena, consider two applications that use Point Grey gear: an augmented reality system and another that allows users to interact with an abstract space.
The augmented reality application blends the actual with the virtual, bridging the two with the help of a head-mounted display based on a Point Grey IEEE-1394 0.3-megapixel CMOS digital camera capable of 60 fps. Developed by a group at the GVU Center at Georgia Institute of Technology in Atlanta, it combines the virtual online world in “Second Life” with the actual world of its users.
Researchers at Georgia Institute of Technology have developed a head-mounted display (left) based on a small camera (inset) as part of a system that allows real people to interact with a virtual world (right). The camera captures the real scene, while software merges it with the virtual one.
For such tasks, associate professor and GVU researcher Blair MacIntyre noted, today’s head-mounted display technology leaves something to be desired, largely because of a limited field of view and inadequate resolution. He is working with a company to start an augmented reality subsidiary focused on a handheld implementation of the technology for use with mobile phones and game consoles – less challenging than trying to do the same in a head-worn display.
Augmenting reality with a virtual world imposes certain requirements on the vision system, MacIntyre said. In particular, the three-dimensional structure of an uncontrolled scene must be captured. “If you put your hand up in front of the camera, but there is supposed to be things in the world farther away than your hand, they should appear behind your hand,” he pointed out.
“But without knowing that the hand in the video is a certain distance away or without even being able to find the hand in the first place, we can’t hide the graphics that should be behind it,” he said.
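The occlusion problem MacIntyre describes comes down to a per-pixel depth comparison. The following is a minimal sketch of that idea, not the Georgia Tech system. It assumes that a distance estimate already exists for every video pixel and for every rendered graphics pixel, and the function name and nested-list image layout are hypothetical.

```python
def composite_with_occlusion(video, video_depth, virtual, virtual_depth):
    """Draw a virtual pixel only where it is nearer than the real scene.

    video, virtual: 2-D lists of pixel values with the same shape.
    video_depth, virtual_depth: 2-D lists of distances from the camera;
    a virtual_depth entry of None means no graphics at that pixel.
    """
    output = []
    for v_row, vd_row, g_row, gd_row in zip(video, video_depth,
                                            virtual, virtual_depth):
        output.append([
            # Keep the camera pixel unless the graphics are strictly closer.
            g if gd is not None and gd < vd else v
            for v, vd, g, gd in zip(v_row, vd_row, g_row, gd_row)
        ])
    return output
```

With a hand at 0.5 m covering one pixel and a virtual object at 2 m, the hand's pixel stays as video while a neighboring pixel showing a wall 5 m away is replaced by the graphics. Without the depth estimate for the hand, the comparison cannot be made, which is the difficulty the quote points to.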
Although such 3-D capture is becoming easier, it is by no means trivial. One potential solution is to use two cameras mounted a known distance apart. Both cameras must be fairly light, and the overlap between the two cameras’ views must be carefully controlled; otherwise, 3-D data cannot be extracted.
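The reason the spacing between the cameras must be known can be seen in the standard relation for a rectified stereo pair, in which distance equals focal length times baseline divided by disparity. The sketch below is illustrative only; it assumes idealized parallel pinhole cameras, and the function and parameter names are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance in meters to a point seen by two rectified, parallel cameras.

    focal_px: focal length expressed in pixels.
    baseline_m: known distance between the two cameras, in meters.
    disparity_px: horizontal shift of the point between the two images.
    """
    if disparity_px <= 0:
        # The point was not matched in both views, or is effectively at
        # infinity, so no distance can be recovered.
        return None
    return focal_px * baseline_m / disparity_px
```

For example, a focal length of 600 pixels, a baseline of 0.1 m and a disparity of 30 pixels give a distance of about 2 m. A point that appears in only one camera's image has no disparity at all, so no distance can be computed for it.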
The second application is courtesy of Squidsoup, a UK-based nonprofit interactive arts organization. A group from Squidsoup created Driftnet, an interactive visual and auditory exhibit that uses an overhead-mounted Point Grey stereo camera based on two CCDs. This provides a 3-D image of the participant and allows for interaction, with movement setting off sounds and altering projections.
Artists mounted a stereo camera (left) above participants and used the camera to capture their motions (middle), allowing them to interact with the Driftnet exhibit of a representation of flying birds. A screen shot (right) shows other representations of the space. Courtesy of Anthony Rowe, Squidsoup.
Squidsoup lead artist Anthony Rowe noted that the group is working on other projects, including one intended for kids that is scheduled to launch at the end of summer. He likes this approach and the technology. “It’s about the best way to create untethered three-dimensional interaction.”
Getting a kick out of games
Interactivity is a theme that runs through several other entertainment applications. The first is the reacTable, a collaborative electronic musical instrument with a tabletop interface developed by a group at Pompeu Fabra University in Barcelona, Spain.
The device makes use of a camera and a projector. The camera, an IEEE-1394 model from Allied Vision Technologies of Stadtroda, Germany, tracks the movement and location of fingers and specially tagged control objects across a transparent table. Markers on the control objects uniquely identify them and enable the system to react to their placement and orientation. This information affects the output of the projector and a sound system, enabling the playing of music and the display of accompanying visualizations. The device can be thought of as a modular synthesizer, with tangible objects as its primary input and music and graphics as its output.
A new type of synthesizer, the reacTable, was developed by researchers from Pompeu Fabra University in Barcelona, Spain. A machine vision camera mounted below a transparent table reads marks, or fiducials, on objects above. Software adjusts what is projected onto the table and what is played, based on the location and orientation of the objects, enabling collaborative music sessions. Courtesy Pompeu Fabra University, Barcelona, Spain.
Researcher Martin Kaltenbrunner, one of four creators of the reacTable, noted that the markers, or fiducials, and the optical tracking system were developed to be fast enough and robust enough to meet the demands of the application. He also pointed out that audiences react well to the device, a response that he attributes to more than technology. “We believe it is not only the new technologies we use, but, after all, the instrument design in general, which allows the audience to access electronic music in a simple and engaging way.”
The reacTable has been used by Icelandic singer Björk in a world tour and has won various awards. The creators are in the process of forming a spin-off company to bring the device to the market by the end of 2008. They will target professional musicians, recording studios, museums and event organizations at first. Kaltenbrunner, however, noted that the platform is suitable for many uses.
A different kind of interactivity lies behind a robotic foosball game developed by researchers at the University of Freiburg in Germany and commercialized by adp Gauselmann GmbH of Espelkamp. In its original incarnation, the robot used an overhead camera to image the playing surface every 20 ms. From this, the system located the ball and decided what action to take, such as kicking the ball by spinning a rod or blocking a shot by moving a player into position.
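One way to picture the step from locating the ball to choosing an action is a straight-line prediction from two successive frames. This is a simplified sketch, not the Freiburg group's method. It assumes constant ball velocity, ignores bounces off the walls, and uses hypothetical names and table coordinates in meters.

```python
FRAME_INTERVAL_S = 0.020  # one image every 20 ms, as described above


def predict_crossing(prev_xy, curr_xy, rod_x, dt=FRAME_INTERVAL_S):
    """Estimate where and when the ball reaches a rod at position rod_x.

    prev_xy, curr_xy: ball positions (x, y) in two successive frames.
    Returns (y_at_rod, seconds_until_arrival), or None if the ball is
    not heading toward the rod.
    """
    vx = (curr_xy[0] - prev_xy[0]) / dt
    vy = (curr_xy[1] - prev_xy[1]) / dt
    if vx == 0:
        return None  # no motion along the length of the table
    t = (rod_x - curr_xy[0]) / vx
    if t < 0:
        return None  # ball is moving away from the rod
    return curr_xy[1] + vy * t, t
```

A ball seen at (0.50, 0.30) and then at (0.48, 0.31) is moving at about 1 m/s toward a rod at x = 0.10, and would cross it near y = 0.50 roughly 0.38 s later, which tells a controller where a blocking player would need to be.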
The original system used a color camera for imaging, which could have led to ball-recognition problems if the lighting varied. However, the company’s Web site states that the commercial version uses infrared illumination and a special light filtering foil to ensure that only IR from the ball is reflected to the camera. Graduate student Dapeng Zhang noted that the research group now has moved away from cameras altogether. “We use a laser measurement system.”
Get back in line
With regard to amusement rides and similar entertainment, vision technology faces a problem. Entech Creative Industries of Orlando, Fla., creates and builds brand destinations for the retail, theme park, entertainment and museum industries. The company has, for example, designed a roller coaster ride that greets audiences with a burst of flame at particular points along a route. Such rides operate day and night, which can lead to technical issues.
As Bob Hartline, vice president of systems engineering, noted, “Vision systems need certain lighting levels, and they need those levels to be consistent. That doesn’t happen too often with the stuff that we work with.”
There is, however, an area of amusement rides where vision systems can play a role: the physical guarding of such rides. Today this is done with barriers and light curtains, devices that detect someone crossing them by the interruption of a beam of light. Using either approach requires that designers anticipate where people might wander and how they might try to penetrate a perimeter.
A device developed by Pilz GmbH of Ostfildern, Germany, and introduced at the beginning of the year could change that. The company’s SafetyEYE uses three CMOS cameras mounted overhead to monitor the space around an object. If something different enters the monitored zone, the device takes action. Michael Beerman, regional sales manager with Pilz Automation Safety, the company’s Canton, Mich.-based North American subsidiary, noted that it originally was developed for industrial uses.
“The purpose was to remove all physical guarding from robotic cells so that you could approach a robot. The robot would slow down, and, eventually, as you continued to approach it, it would shut off,” he said.
The product has attracted the attention of the amusement industry, in part because of its flexibility. From a vantage point above a location, the device segments a space into various zones, with an intrusion into the outer zone causing a warning to be issued and one into the inner zone causing a stop. The zones can be changed easily and the actions tailored as needed, something difficult to do with a physical barrier. The device also has the advantage of being certified as safe by third parties, an important asset.
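The flexibility of software-defined zones can be illustrated with a few lines of code. This is a hypothetical sketch of the warn-then-stop idea only, not the SafetyEYE's configuration or logic, and it is not suitable for a safety-rated system. It assumes circular zones around a single protected point, with all names invented for illustration.

```python
import math


def zone_action(intruder_xy, center_xy, stop_radius_m, warn_radius_m):
    """Return 'stop', 'warn' or 'clear' for a detected position.

    The inner zone (within stop_radius_m of center_xy) triggers a stop;
    the outer zone (within warn_radius_m) triggers a warning.
    """
    distance = math.hypot(intruder_xy[0] - center_xy[0],
                          intruder_xy[1] - center_xy[1])
    if distance <= stop_radius_m:
        return "stop"
    if distance <= warn_radius_m:
        return "warn"
    return "clear"
```

Changing a zone here means changing a number rather than moving a fence, which is the contrast with physical barriers drawn above.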
These and other examples show how vision technology intended for industrial uses is finding its way into entertainment. If the trend continues, it could be time for the industry to take fun and games seriously.