Intro Course Description
Are you an engineer, developer or engineering manager eager to harness the power of generative AI for cutting-edge computer vision applications? Join us for an intensive three-hour training session designed to introduce the latest techniques in vision-language models (VLMs) and their integration with traditional computer vision methods. With a focus on the practical application of these techniques for real-world use cases, this course is tailored for professionals looking to expand their skill set in AI-driven computer vision, particularly in systems designed for deployment at the edge.
What You’ll Learn
Introduction to VLMs and LLM+Computer Vision Techniques with Phil Lapsley, Vice President, Edge AI and Vision Alliance:
Technical Deep Dive with Dr. Satya Mallick, CEO of OpenCV:
Who Should Attend
This training is ideal for engineers, developers, engineering managers and CTOs with a basic understanding of Python, Jupyter Notebook and computer vision concepts. Whether you’re working in mobile development, embedded systems or cloud applications, this course will provide you with the tools and knowledge to implement sophisticated AI solutions in your projects.
To make the most of this training, you should have:
Why Attend?
The field of generative AI and multimodal LLMs is moving at a breakneck pace. This course is designed to help you keep up with this rapidly evolving technical landscape. In particular, it provides a unique blend of foundational knowledge and practical applications, ensuring you leave with actionable skills and access to sample code for continued learning.
Register Today
Registration is $495. Don’t miss this opportunity to enhance your skills and stay at the forefront of computer vision technology. Register today to secure your spot in this transformative training session. (You can save $50 if you also register for the Embedded Vision Summit!)
Advanced Course Description
Ready to move beyond CLIP-style “image + caption” alignment and build vision-language systems that reason over time and take actions? This live, hands-on advanced training dives into modern VLMs designed for video understanding, multimodal reasoning, and agentic behavior. You’ll learn how today’s “thinking” vision models differ from classic VLM pipelines—and you’ll implement practical patterns for building applications that observe, reason, decide, and adapt.
This session is taught by Satya Mallick, CEO of OpenCV, and focuses exclusively on advanced, real-world capabilities using state-of-the-art models such as Qwen3.5, GLM-4.1V-Thinking, and LLaVA-NeXT.
This is an advanced course. If you haven’t taken the introductory VLM session, we strongly recommend it—or ensure you meet the prerequisites below.
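For context, the CLIP-style "image + caption" alignment mentioned above reduces to ranking candidate captions by embedding similarity. A minimal schematic sketch with stand-in embeddings (a real system would use a CLIP image and text encoder; the vectors and captions below are illustrative only):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_captions(image_emb: np.ndarray, caption_embs: dict) -> list:
    """Score each caption embedding against the image embedding and
    return (caption, score) pairs, highest similarity first -- the
    core retrieval step in CLIP-style alignment."""
    scores = {cap: cosine_similarity(image_emb, emb)
              for cap, emb in caption_embs.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Stand-in 3-d embeddings; real CLIP embeddings are hundreds of dimensions.
image_emb = np.array([0.9, 0.1, 0.0])
captions = {
    "a dog on a beach": np.array([0.8, 0.2, 0.1]),
    "a city at night":  np.array([0.0, 0.1, 0.9]),
}
ranked = rank_captions(image_emb, captions)
```

This scores a single static image against fixed text; it has no notion of time, multi-step reasoning, or action, which is exactly the gap the models in this course address.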
What You’ll Learn
Advanced VLMs: From Perception to Reasoning to Decision
We’ll start by reframing what “modern VLMs” are actually good at—especially where classic approaches break:
Reasoning VLMs (and What “Thinking” Really Means)
Get a clear, engineer-focused understanding of:
Video VLMs and Temporal Reasoning
You’ll learn and implement techniques to handle the hard parts of video understanding:
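As one example of the kind of building block involved, a common first step in video VLM pipelines is uniform frame sampling: choosing a small, evenly spaced subset of frames so a long clip fits in the model's context. A minimal sketch (pure Python, independent of any particular model, and not drawn from the course materials):

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list:
    """Pick num_samples evenly spaced frame indices from a clip.

    Uniform sampling trades temporal detail for context budget; more
    sophisticated pipelines sample adaptively around motion or events.
    """
    if total_frames <= num_samples:
        return list(range(total_frames))
    step = total_frames / num_samples
    # Take the midpoint of each segment for even temporal coverage.
    return [int(step * i + step / 2) for i in range(num_samples)]

# e.g. pick 8 frames from a 100-frame clip
indices = sample_frame_indices(100, 8)
```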
Agentic VLMs: Observe → Reason → Decide → Act → Remember
Go from “model outputs” to systems with behavior:
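To make the loop concrete, here is a minimal sketch of an observe-reason-decide-act-remember cycle with stubbed perception and reasoning. All function names and logic below are illustrative placeholders (not course code); a real agent would call a VLM in `observe` and a reasoning model in `reason`:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    memory: list = field(default_factory=list)  # what the agent retains across steps

def observe(frame):
    # Placeholder perception: a real agent would run a VLM on the frame.
    return {"description": f"frame shows: {frame}"}

def reason(observation, memory):
    # Placeholder reasoning: a real agent would prompt a reasoning VLM
    # with the observation plus relevant retrieved memory.
    return "alert" if "person" in observation["description"] else "continue"

def act(decision):
    # Placeholder actuation: log, notify, call a tool, etc.
    return f"action: {decision}"

def run_agent(frames, state: AgentState) -> list:
    """One pass of the observe -> reason -> decide -> act -> remember loop."""
    actions = []
    for frame in frames:
        obs = observe(frame)                  # observe
        decision = reason(obs, state.memory)  # reason / decide
        actions.append(act(decision))         # act
        state.memory.append((obs, decision))  # remember
    return actions
```

The design point is the separation of concerns: perception, reasoning, and actuation are swappable components, while the state object is what lets the system adapt its behavior over time.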
Who Should Attend
This workshop is ideal for engineers, developers, and technical leads who want to build the next generation of:
Prerequisites
To get the most from the hands-on work, you should be:
Why Attend?
Multimodal AI is rapidly shifting from static perception to temporal reasoning and autonomous decision-making. This course helps you keep pace with where VLMs are going—while staying grounded in implementable techniques, real tradeoffs, and working applications you can build on after the session.
Register Today
Registration is $495. Don’t miss this opportunity to enhance your skills and stay at the forefront of computer vision technology. Register today to secure your spot in this transformative training session. (You can save $50 if you also register for the Embedded Vision Summit!)

Interested in sponsoring or exhibiting?
The Embedded Vision Summit gives you unique access to the best-qualified technology buyers you’ll ever meet.