Date: Thursday, May 23
Start Time: 2:05 pm
End Time: 2:35 pm
In this talk, we will explore the use of large multimodal models (LMMs) in real-world edge applications. We will begin by explaining how LMMs work and highlighting their key components, with special attention to how they fuse understanding across the vision and language domains. Next, we'll discuss the process of training LMMs and the types of data needed to tune them for specific tasks. Finally, we'll outline some of the key challenges of deploying LMMs on resource-constrained edge devices and share techniques for overcoming them.