Are you an engineer, developer, or engineering manager eager to harness the power of generative AI for cutting-edge computer vision applications? Join us for an intensive three-hour training session introducing the latest techniques in vision-language models (VLMs) and their integration with traditional computer vision methods. With a focus on applying these techniques to real-world problems, this course is tailored for professionals looking to expand their skill set in AI-driven computer vision, particularly in systems designed for deployment at the edge.
What You’ll Learn:
Introduction to VLMs and LLM+CV Techniques with Jeff Bier, Founder of the Edge AI and Vision Alliance: We’ll start with an overview of vision-language models and how they differ from conventional convolutional neural networks. From there, we’ll examine the advantages and potential drawbacks of integrating LLMs and VLMs with computer vision, and explore real-world applications that benefit from these advanced techniques.
Technical Deep Dive with Satya Mallick, CEO of OpenCV: Gain insights into the basics of VLMs, including embeddings, CLIP, and how different modalities (text, vision) are encoded. Learn about the types of training data required and the loss functions used in these models. This segment will provide the necessary background to tackle the practical examples that follow.
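To make the encoding idea concrete, here is a minimal sketch of how text and images land in CLIP’s shared embedding space, using the open-source Hugging Face transformers implementation. The checkpoint name and image file are illustrative placeholders, not the course’s exact materials:

```python
# A minimal sketch (not the course's exact code): encode text and images
# into CLIP's shared embedding space with Hugging Face transformers.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # placeholder: any local image
texts = ["a photo of a dog", "a photo of a cat"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# The text and image encoders project into the same space, so cosine
# similarity between L2-normalized embeddings measures image-text agreement.
image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
print(image_emb @ text_emb.T)  # higher score = closer match
```

Because both modalities are projected into one space, CLIP’s contrastive training loss can pull matching image-text pairs together and push mismatched pairs apart, which is the background this segment builds on.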
First Hands-On Example: Zero-Shot Image Classification. Our first practical example will be image classification with CLIP for zero-shot learning. You’ll build an image classifier capable of recognizing a wide array of categories without any task-specific training. Discover how CLIP’s zero-shot classification can be deployed on mobile devices, and learn how to fine-tune the model for enhanced performance on specific datasets.
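As a preview of this exercise, the sketch below scores an image against text prompts built from candidate class names. The labels, file name, and checkpoint are placeholders, and the course’s sample code may differ:

```python
# A minimal zero-shot classification sketch with CLIP. The checkpoint, file
# name, and class labels are placeholders, not the course's exact materials.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def classify(image_path, class_names):
    """Score an image against text prompts; no task-specific training needed."""
    image = Image.open(image_path)
    prompts = [f"a photo of a {name}" for name in class_names]
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # shape: (1, num_classes)
    probs = logits.softmax(dim=-1).squeeze(0)
    return class_names[int(probs.argmax())], probs

label, probs = classify("example.jpg", ["dog", "cat", "bicycle", "traffic light"])
print(label, probs.tolist())
```

Swapping in a new label list is all it takes to retarget the classifier, which is what makes the zero-shot approach attractive for edge deployments.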
Second Hands-On Example: VLM with Agnostic Object Detector. We’ll develop a VLM-based visual AI system that identifies objects and reasons about them using pre-existing world knowledge. We’ll accomplish this by using a CNN-based class-agnostic object detector and integrating it with a VLM to answer complex questions about detected objects.
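The pipeline can be sketched roughly as follows. Here propose_boxes is a hypothetical stand-in for the CNN-based class-agnostic detector, and BLIP-VQA stands in for the VLM; the models used in the course may differ:

```python
# A rough sketch of the detect-then-reason pipeline. `propose_boxes` is a
# hypothetical stub for a CNN-based class-agnostic detector, and BLIP-VQA
# stands in for the VLM; the course's actual models may differ.
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
vlm = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

def propose_boxes(image):
    # Hypothetical detector stub: a real class-agnostic detector would return
    # (left, upper, right, lower) boxes with no class labels. Returning the
    # whole frame keeps this sketch runnable end to end.
    return [(0, 0, image.width, image.height)]

def ask_about_objects(image_path, question):
    image = Image.open(image_path).convert("RGB")
    answers = []
    for box in propose_boxes(image):
        crop = image.crop(box)  # isolate one detected region
        inputs = processor(crop, question, return_tensors="pt")
        with torch.no_grad():
            output_ids = vlm.generate(**inputs)
        answers.append((box, processor.decode(output_ids[0], skip_special_tokens=True)))
    return answers

print(ask_about_objects("example.jpg", "What is this object used for?"))
```

Because the detector carries no class labels, the VLM supplies the world knowledge needed to name and reason about whatever is detected.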
Who Should Attend:
This training is ideal for engineers, developers, engineering managers, and CTOs with a basic understanding of Python, Jupyter notebooks, and computer vision concepts. Whether you’re working in mobile development, embedded systems, or cloud applications, this course will provide you with the tools and knowledge to implement sophisticated AI solutions in your projects.
To make the most out of this training, you should have:
A basic understanding of Python
Familiarity with Jupyter notebooks
A working knowledge of fundamental computer vision concepts
Why Attend?
The field of generative AI and multimodal LLMs is moving at a breakneck pace, and this course offers an efficient way to keep up with that rapidly evolving landscape. It provides a unique blend of foundational knowledge and practical application, ensuring you leave with actionable skills and sample code for continued learning.
Registration is $495. Don’t miss this opportunity to enhance your skills and stay at the forefront of computer vision technology. Register today to secure your spot in this transformative training session.
Interested in sponsoring or exhibiting?
The Embedded Vision Summit gives you unique access to the best-qualified technology buyers you’ll ever meet.