Steve Teig is a visionary technologist and serial entrepreneur whose work has impacted industries ranging from software and semiconductors to biotechnology and machine learning. He has served as founder and/or CTO for multiple companies, and his contributions have been recognized with numerous awards, including an Edison Award and a World Technology Award. He is an inventor on 362 U.S. patents across multiple disciplines, and a well-regarded speaker who has delivered keynotes and invited lectures at conferences and universities around the world. He currently serves as CEO of Perceive, which provides the Ergo® edge AI processor, a purpose-built chip to enable sophisticated neural networks to run within power-constrained devices for a wide range of applications.
To reduce the memory requirements of neural networks, researchers have proposed numerous heuristics for compressing weights: lower precision, sparsity, weight sharing, and various other schemes shrink the memory needed for the network's weights, or "program." Unfortunately, during network execution, memory use is usually dominated by activations – the data flowing through the network – rather than by weights. Although lower precision can reduce activation memory somewhat, more extreme measures are needed to enable large networks to run efficiently with small memory footprints. Fortunately, the underlying information content of activations is often modest, so novel compression strategies can dramatically widen the range of networks that can execute on constrained hardware. In this talk, we introduce new strategies for compressing activations that sharply reduce their memory footprint.
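To make the weight-versus-activation imbalance concrete, here is a rough back-of-the-envelope sketch (not from the talk; all shapes and sizes are hypothetical examples). For a single 3x3 convolution on a high-resolution input, the output feature map can require orders of magnitude more memory than the layer's weights:

```python
def conv_memory(h, w, c_in, c_out, k=3, bytes_per_value=1):
    """Return (weight_bytes, activation_bytes) for one k x k conv layer.

    weight_bytes:     memory for the layer's parameters (the "program")
    activation_bytes: memory for the output feature map flowing through
                      the network (assuming 'same' padding, stride 1)
    """
    weights = k * k * c_in * c_out * bytes_per_value
    activations = h * w * c_out * bytes_per_value
    return weights, activations

# Hypothetical example: a 1080p feature map with modest channel counts.
w_bytes, a_bytes = conv_memory(h=1080, w=1920, c_in=32, c_out=32)
print(f"weights:     {w_bytes / 1024:.0f} KiB")   # a few KiB of weights
print(f"activations: {a_bytes / 2**20:.0f} MiB")  # tens of MiB of activations
```

Even with 8-bit values, the weights here fit in about 9 KiB while the activations occupy roughly 63 MiB, which is why weight-only compression schemes leave most of the memory problem untouched on power-constrained edge devices.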