Imagine a world where AI systems automatically detect theft in grocery stores, ensure construction-site safety, and identify patient falls in hospitals. This is no longer science fiction: companies today are building powerful applications that integrate visual content with textual data to understand context and act intelligently. In this talk, we will delve into vision-language models (VLMs), the core technology behind these intelligent applications, and introduce the Pentagram framework, a structured approach to prompt engineering that significantly improves VLM accuracy and effectiveness. We’ll show, step by step, how to use this prompt engineering process to create an application that uses a VLM to detect suspicious behaviors such as item concealment in grocery stores. We’ll also explore the broader applications of these techniques in a variety of real-world scenarios. Join us to discover the possibilities of vision-language models and learn how to unlock their full potential.
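
To give a concrete flavor of the kind of application the talk walks through, here is a minimal sketch of querying a hosted VLM about a single camera frame. It assumes the OpenAI Python client as the backend, and the prompt wording is a generic illustration of behavior-detection prompting, not the Pentagram framework itself; the model name and helper function are illustrative choices.

```python
# Minimal sketch: asking a hosted VLM whether a grocery-store frame shows
# item concealment. Assumes the OpenAI Python client (openai>=1.0) as the
# backend; the prompt text is a generic example, not the Pentagram framework.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def flag_concealment(image_path: str) -> str:
    # Encode the frame so it can be sent inline as a data URL.
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable chat model works here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": (
                    "You are monitoring a grocery store camera feed. "
                    "Does this frame show a shopper concealing an item, "
                    "e.g., placing it in a pocket or bag instead of the cart? "
                    "Answer YES or NO, then briefly explain what you see."
                )},
                {"type": "image_url", "image_url": {
                    "url": f"data:image/jpeg;base64,{image_b64}"
                }},
            ],
        }],
    )
    return response.choices[0].message.content

# Example usage:
# print(flag_concealment("frame_0042.jpg"))
```

In practice, structured prompting approaches like the one presented in the talk refine this kind of raw instruction with additional context and constraints to reduce false positives.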