Date: Wednesday, May 22
Start Time: 5:25 pm
End Time: 5:55 pm
Large language models (LLMs) are revolutionizing the way we interact with computers and the world around us. To truly understand the world, however, LLM-powered agents need to be able to see. Will production models be natively multimodal, or will text-only LLMs leverage purpose-built vision models as tools? Where do techniques like multimodal retrieval-augmented generation (RAG) fit in? In this talk, Jacob Marks will give an overview of key LLM-centered projects that are reshaping the field of computer vision and discuss where we are headed in a multimodal world.