Multimodal LLMs promise to bring exciting new abilities to devices! As foundation models become more capable, their compute requirements grow as well. LLMs now routinely reach tens of billions of parameters, growing faster than the capabilities of embedded processors. In this talk, we introduce the concept of a “neural cascade,” a scheme that divides computation across devices. We’ll present a recipe for constructing a neural cascade from a pre-existing LLM, and we’ll show how this system harmonizes edge and cloud devices to enable new experiences.
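As a rough sketch of the general cascade pattern (not the specific recipe presented in the talk), a small on-device model can serve requests it is confident about and escalate the rest to a larger cloud model. The function names, the confidence scoring, and the threshold below are illustrative assumptions.

```python
# Hypothetical stand-ins for the two tiers of a neural cascade.
# In a real system these would wrap an on-device model and a
# cloud-hosted model; here they are stubs for illustration.

def edge_generate(prompt: str) -> tuple[str, float]:
    """Small on-device model: returns a draft answer and a confidence score."""
    draft = f"[edge draft for: {prompt}]"
    confidence = 0.42  # placeholder; a real model would score its own output
    return draft, confidence

def cloud_generate(prompt: str) -> str:
    """Large cloud-hosted model: slower and costlier, but more capable."""
    return f"[cloud answer for: {prompt}]"

def cascade(prompt: str, threshold: float = 0.8) -> str:
    """Serve from the edge when the small model is confident enough;
    otherwise escalate the request to the cloud tier."""
    draft, confidence = edge_generate(prompt)
    if confidence >= threshold:
        return draft
    return cloud_generate(prompt)

if __name__ == "__main__":
    print(cascade("Describe this photo."))
```

The appeal of this split is that easy requests never leave the device, saving latency and cloud cost, while hard requests still get the quality of the larger model.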