Optics Accelerates Deep Learning Computations on Smart Devices

MIT researchers have created a method for computing directly on smart home devices that drastically reduces the latency such devices can exhibit before responding to a command or answering a question. One reason for this delay is that connected devices don’t have enough memory or power to store and run the enormous machine learning models needed to understand a question. Instead, the question is sent to a data center that can be hundreds of miles away, where an answer is computed and sent back to the device.

The MIT researchers’ technique shifts the memory-intensive steps of running a machine learning model to a central server, where components of the model are encoded onto lightwaves. The waves are transmitted to a connected device over optical fiber, which enables large quantities of data to be sent through a network at high speed. The receiver then uses a simple optical device that rapidly performs computations using the parts of the model carried by those lightwaves.

The technique led to a more than hundredfold improvement in energy efficiency compared with other methods. It could also improve security, since a user’s data does not need to be transferred to a central location for computation.

Further, the method could enable a self-driving car to make decisions in real time while using just a tiny percentage of the energy currently required by power-hungry computers. It could also be used for live video processing over cellular networks, or even enable high-speed image classification on a spacecraft millions of miles from Earth.

Senior author Dirk Englund, an associate professor in the Department of Electrical Engineering and Computer Science (EECS), as well as a member of the MIT Research Laboratory of Electronics, said, “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-length movie over the internet every millisecond or so. That is how fast data comes into our system. And it can compute as fast as that.” 

A smart transceiver uses silicon photonics technology to dramatically accelerate one of the most memory-intensive steps of running a machine learning model. This can enable an edge device, such as a smart home speaker, to perform computations with a more than hundredfold improvement in energy efficiency. Courtesy of Alexander Sludds.

According to lead author and EECS graduate student Alexander Sludds, the process of fetching data — the “weights” of the neural network, in this case — from memory and moving it to the parts of a computer that do the actual computation is one of the biggest limiting factors for speed and energy efficiency. “So, our thought was, why don’t we take all that heavy lifting — the process of fetching billions of weights from memory — move it away from the edge device and put it someplace where we have abundant access to power and memory, which gives us the ability to fetch those weights quickly?” Sludds said.

To address this bottleneck, the team developed a new neural network architecture. Neural networks can contain billions of weight parameters — numeric values that transform input data as it is processed. These weights must be stored in memory. At the same time, the data transformation process involves billions of computations, which require a great deal of power to perform.
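
To make the scale of that data movement concrete, consider a single dense layer: each inference must pull every weight out of memory just to perform one matrix-vector multiply, so data movement, not arithmetic, dominates. A minimal sketch (with illustrative sizes, not figures from the paper):

```python
import numpy as np

# Toy dense layer illustrating the bottleneck: every inference must
# fetch the entire weight matrix from memory. Sizes are illustrative
# and not taken from the paper.
n_in, n_out = 4096, 4096
W = np.random.randn(n_out, n_in).astype(np.float32)  # weight matrix
x = np.random.randn(n_in).astype(np.float32)         # input activations

y = W @ x  # one layer's transformation: n_out * n_in multiply-accumulates

bytes_fetched = W.nbytes  # weight data that must move from memory
macs = n_out * n_in       # arithmetic operations performed
print(f"{bytes_fetched / 1e6:.0f} MB of weights fetched "
      f"for {macs / 1e6:.0f} million multiply-accumulates")
```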

The neural network architecture that the team developed, called Netcast, involves storing the weights in a central server that is connected to a smart transceiver. The smart transceiver, a thumb-size chip that can receive and transmit data, uses silicon photonics to fetch trillions of weights from memory each second. Weights are received as electrical signals and encoded onto lightwaves: since the weight data is stored as bits — 1s and 0s — the transceiver converts them by switching lasers, turning a laser on for a 1 and off for a 0. It combines these lightwaves and then periodically broadcasts them through a fiber-optic network, so a client device doesn’t need to query the server to receive them.
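
As a rough sketch of that encoding step, the server could serialize each weight into bits and key a laser on or off for each bit. The 8-bit quantization and MSB-first framing below are assumptions for illustration, not details from the paper:

```python
import numpy as np

def weights_to_ook(weights: np.ndarray, bits: int = 8) -> np.ndarray:
    """Quantize weights and return a flat 0/1 on-off-keying stream.

    Hypothetical encoding: the bit width and framing are assumptions,
    not details taken from the Netcast paper.
    """
    # Scale weights into the unsigned integer range [0, 2**bits - 1].
    lo, hi = weights.min(), weights.max()
    q = np.round((weights - lo) / (hi - lo) * (2**bits - 1)).astype(np.uint64)
    # Unpack each quantized weight, MSB first, into one long bit stream;
    # each bit is one laser state: on for a 1, off for a 0.
    shifts = np.arange(bits - 1, -1, -1, dtype=np.uint64)
    return ((q[:, None] >> shifts) & 1).astype(np.uint8).ravel()

stream = weights_to_ook(np.random.randn(1024))
print(stream[:16])  # the first 16 on/off laser states sent down the fiber
```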

Once the lightwaves arrive at the client device, a broadband Mach-Zehnder modulator uses them to perform superfast analog computation. This involves encoding input data from the device, such as sensor information, onto the weights. Each individual wavelength is then sent to a receiver that detects the light and measures the result of the computation.
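
Conceptually, the operation at the receiver is a dot product carried out in the analog domain: the light’s intensity carries a weight, the modulator scales it by the local input, and the photodetector sums the products. The sketch below is a deliberately idealized model (noiseless, perfectly linear, and ignoring how signed values are handled), not the paper’s device equations:

```python
import numpy as np

def analog_mac(weights: np.ndarray, inputs: np.ndarray) -> float:
    """Idealized model of the optical multiply-accumulate at the client."""
    light_in = weights             # incoming intensity encodes each weight
    modulated = light_in * inputs  # modulator transmission set by the input
    return float(modulated.sum()) # photodetector integrates the total power

w = np.array([0.2, 0.5, 0.8, 0.1])  # weights streamed in on light
x = np.array([1.0, 0.3, 0.7, 0.9])  # local sensor data at the device
print(analog_mac(w, x), "matches the digital dot product:", float(w @ x))
```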

The researchers devised a way to set the modulator to do trillions of multiplications per second. This vastly increased the speed of computation on the device while using only a tiny amount of power.

“In order to make something faster, you need to make it more energy efficient,” Sludds said. “But there is a trade-off. We’ve built a system that can operate with about a milliwatt of power but still do trillions of multiplications per second. In terms of both speed and energy efficiency, that is a gain of orders of magnitude.”

The researchers tested the architecture by sending weights over an 86-km fiber connecting their lab to MIT Lincoln Laboratory. Netcast enabled machine learning with high accuracy — 98.7% for image classification and 98.8% for digit recognition — at rapid speeds.
 
Now, the researchers want to iterate on the smart transceiver chip to achieve even better performance. They also want to miniaturize the receiver, which is currently the size of a shoebox, to the size of a single chip. This would enable the chip to fit onto a smart device like a cellphone.

Euan Allen, a Royal Academy of Engineering Research fellow at the University of Bath who was not involved with this work, said, “Using photonics and light as a platform for computing is a really exciting area of research with potentially huge implications on the speed and efficiency of our information technology landscape. The work of Sludds et al. is an exciting step toward seeing real-world implementations of such devices, introducing a new and practical edge-computing scheme whilst also exploring some of the fundamental limitations of computation at very low (single-photon) light levels.”

The research is funded, in part, by NTT Research, the National Science Foundation, the Air Force Office of Scientific Research, the Air Force Research Laboratory, and the Army Research Office.

The research was published in Science (www.doi.org/10.1126/science.abq8271).

