Accelerating Optical Communications with AI
By Chris Edwards
Communications of the ACM, July 2023, Vol. 66 No. 7, Pages 13-15
DOI: 10.1145/3595957
Photonic computing has seen its share of research breakthroughs and deep research winters, much like the history of artificial intelligence (AI). Now, with the resurgence of AI, the huge amount of energy today's large neural-network models consume when running on electronic computers is reawakening interest in uniting the two.
More than 30 years ago, during one of the booms in research into artificial neural networks, Demetri Psaltis and colleagues at the California Institute of Technology demonstrated how techniques from holography could perform rudimentary face recognition. The team members showed they could store as many as one billion weights for a two-layer neural network using the core elements from a liquid-crystal display. Similar spatial light modulators became the foundation of several attempts to commercialize optical computing technology, including those by U.K.-based startup Optalysys, which has focused in recent years on applying the technology to accelerating homomorphic encryption to support secure remote computing.
Though some groups are using spatial light modulators for AI, they represent just one category of optical computer suitable for the job. There are also choices to be made as to which form of neural network best suits optical computing. Some techniques target the matrix-arithmetic operations of mainstream deep-learning pipelines, while others focus squarely on emulating the spike trains of biological brains.
What all the proposed systems have in common is the possibility that, by using photons to communicate and calculate, they will deliver major advantages in density and speed over systems based purely on electrical signaling. A 2021 study of inferencing based on matrix arithmetic by Mitchell Nahmias, now CTO of startup Luminous Computing, and colleagues at Princeton University argued the theoretical efficiency of AI inferencing in the optical domain could far surpass that of conventional accelerators based on existing electronics-only architectures.
The key issue for machine learning is the amount of energy needed to move data around accelerators. Electronic accelerators often employ strategies to cache as much data as possible to reduce this overhead, with major trade-offs concerning whether temporary results or weights are held in the cache depending on the model's structure. However, the energy cost of delivering photons over distances larger than the span of a single chip is far lower than it is for electrical signaling.
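To make that trade-off concrete, here is a minimal sketch, in Python, of the kind of accounting involved; the layer sizes, tile size, and the two-schedule model are illustrative assumptions, not details from any accelerator mentioned in this article. It counts how many values must cross the chip boundary for one fully connected layer when the weights are kept resident on chip versus when a tile of activations is kept resident and the weights are re-streamed.

```python
# Hedged back-of-envelope: off-chip traffic (in values moved, not joules)
# for a fully connected layer under two simple caching schedules.

def traffic_weights_resident(k_in, k_out, batch):
    """Weights stay on chip; activations and outputs stream through once."""
    weights = k_in * k_out          # fetched a single time
    activations = batch * k_in      # streamed once
    outputs = batch * k_out         # written once
    return weights + activations + outputs

def traffic_activations_resident(k_in, k_out, batch, tile):
    """A tile of `tile` samples stays on chip; the full weight matrix is
    re-streamed for every tile."""
    num_tiles = -(-batch // tile)   # ceiling division
    weights = num_tiles * k_in * k_out
    activations = batch * k_in
    outputs = batch * k_out
    return weights + activations + outputs

if __name__ == "__main__":
    k_in, k_out, batch = 4096, 4096, 64     # illustrative layer and batch sizes
    print("weights resident:    ", traffic_weights_resident(k_in, k_out, batch))
    print("activations resident:", traffic_activations_resident(k_in, k_out, batch, tile=8))
```

Whichever operand is not held on chip dominates the traffic, which is why the best choice depends on the model's structure; optical links change the equation by making each of those off-chip transfers far cheaper.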
A second potential advantage of photonic AI comes from the ease with which it can handle complex operations in the analog domain, though the power savings achievable here are less certain than for communications. Whereas matrix arithmetic relies on highly parallelized hardware circuits for performance in conventional systems, simply passing photons through an optical component such as a Mach-Zehnder interferometer (MZI) or micro-ring resonator can perform arithmetic that would require hundreds or even thousands of logic gates in a digital circuit.
In the MZI, coherent beams of light pass through a succession of couplers and phase shifters. At each coupling point, interference between the beams redistributes their amplitudes in a way that, together with the programmed phase shifts, can be interpreted as part of a matrix multiplication. A 4x4 matrix operation requires just four inputs that feed into six coupling elements, with four output ports providing the result. The speed of the operations is limited only by the rate at which coherent pulses can be passed through the array.
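A minimal sketch of that arithmetic, assuming a simplified 2x2 parameterization for each interferometer and one of several possible arrangements of the six elements (neither taken from this article), is shown below: cascading six 2x2 unitaries across four waveguides yields a 4x4 unitary that is applied to the four input field amplitudes.

```python
# Illustrative model of an MZI mesh as a matrix multiplier: each MZI applies
# a 2x2 unitary to a pair of optical modes, and six of them across four
# modes compose into a 4x4 unitary (the parameterization here is simplified).
import numpy as np

def mzi(theta, phi):
    """2x2 transfer matrix of one interferometer: a programmable phase
    combined with a tunable coupling angle theta."""
    return np.array([
        [np.exp(1j * phi) * np.cos(theta), -np.sin(theta)],
        [np.exp(1j * phi) * np.sin(theta),  np.cos(theta)],
    ])

def embed(u2, pair, n=4):
    """Place a 2x2 unitary on two adjacent waveguides of an n-mode mesh."""
    u = np.eye(n, dtype=complex)
    u[np.ix_([pair, pair + 1], [pair, pair + 1])] = u2
    return u

rng = np.random.default_rng(0)
pairs = [2, 1, 2, 0, 1, 2]                 # six couplers in a triangular layout
mesh = np.eye(4, dtype=complex)
for p in pairs:
    theta, phi = rng.uniform(0, 2 * np.pi, size=2)
    mesh = embed(mzi(theta, phi), p) @ mesh

assert np.allclose(mesh.conj().T @ mesh, np.eye(4))   # the mesh is unitary
x = np.array([1.0, 0.5, 0.0, 0.25], dtype=complex)    # input field amplitudes
print(np.abs(mesh @ x) ** 2)                          # detected output powers
```

Setting the phases programs the matrix; once programmed, every pulse that traverses the mesh is multiplied by it in passing, which is where the speed claim above comes from.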
In analog architectures, noise presents a significant hurdle. Work by numerous groups on accelerating inference operations has shown that deep neural networks can work successfully at an effective resolution of 4 bits, but hardware overhead and energy rise quickly as resolution increases. These effects may limit the practical energy advantage of photonic designs.
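The sketch below illustrates the point with arbitrary sizes and an arbitrary noise floor, both chosen for illustration rather than taken from any photonic system: the weights of a random matrix-vector product are quantized to a given bit width, a small amount of Gaussian "analog" noise is added, and the error relative to the full-precision result is reported.

```python
# Illustrative only: effect of low-precision weights plus analog noise on a
# matrix-vector product, measured against the full-precision result.
import numpy as np

def quantize(w, bits):
    """Uniform symmetric quantization of w to a signed `bits`-bit grid."""
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

rng = np.random.default_rng(1)
w = rng.standard_normal((256, 256))
x = rng.standard_normal(256)
exact = w @ x

for bits in (8, 6, 4, 2):
    y = quantize(w, bits) @ x
    y += rng.standard_normal(y.shape) * 0.01 * np.std(exact)   # assumed noise floor
    rel_err = np.linalg.norm(y - exact) / np.linalg.norm(exact)
    print(f"{bits}-bit weights: relative error ~ {rel_err:.3f}")
```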
Estimates by Alexander Tait, assistant professor at Queen's University, Ontario, found the potential power savings easily eroded by the practical limitations of today's optical devices. Tait calculated that just 500 micro-ring resonators acting as neurons in a fully connected layout could fit onto a single 1cm² die using early-2020s technology, but operating at 10GHz, it would require a kilowatt of power. Tait stresses the example shows the impact of the current need for heaters to tune optical properties; scaling and design changes could bring the energy down dramatically. "The heaters are certainly a solvable problem," he says.
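A rough reconstruction of the kind of estimate Tait describes, with the per-heater tuning power and the one-ring-per-weight assumption as ballpark figures of my own rather than numbers from him, shows how quickly the heaters come to dominate:

```python
# Back-of-envelope only: the per-heater power is an assumed ballpark value;
# only the 500-neuron, fully connected layout and the ~1kW total come from
# the text above.
neurons = 500
weight_rings = neurons * neurons        # fully connected: roughly one ring per weight
heater_power_w = 4e-3                   # assumed ~4 mW of thermal tuning per ring
total_w = weight_rings * heater_power_w
print(f"{weight_rings:,} rings x {heater_power_w * 1e3:.0f} mW ~ {total_w / 1e3:.1f} kW")
```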