Machine learning models are increasingly being deployed at the edge, and these models are growing larger. As a result, we are running into the constraints of edge devices: bandwidth, performance, and power. One way to reduce ML computation demands and increase power efficiency is quantization, a set of techniques that reduce the number of bits needed to represent model parameters and activations, and hence reduce bandwidth, computation, and storage requirements. Qualcomm® Snapdragon™ SoCs provide a robust hardware platform for deploying ML applications in embedded and mobile devices. Many Snapdragon SoCs incorporate the Qualcomm Artificial Intelligence Engine, which combines hardware and software components to accelerate on-device ML. In this talk, we will explore the performance and accuracy offered by the accelerator cores within the AI Engine. We will also highlight the tools and techniques Qualcomm offers for developers targeting these cores, using intelligent quantization to deliver high performance with low power consumption while maintaining algorithm accuracy.
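To make the core idea concrete, here is a minimal sketch of affine (asymmetric) 8-bit quantization, the basic building block behind many of these techniques. This is an illustrative NumPy example, not Qualcomm's toolchain: the function names and the uint8 range are assumptions chosen for the demo.

```python
import numpy as np

def quantize_uint8(x):
    """Affine quantization: map float values to uint8 via a scale and zero point.
    Illustrative sketch only; production tools choose scale/zero-point per layer
    or per channel, often from calibration data."""
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / 255.0
    if scale == 0.0:           # constant input: any scale works
        scale = 1.0
    zero_point = int(round(-x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized representation."""
    return (q.astype(np.float32) - zero_point) * scale

# 8-bit storage uses 4x less memory and bandwidth than float32,
# at the cost of a small, bounded rounding error.
weights = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zp = quantize_uint8(weights)
restored = dequantize(q, scale, zp)
```

Each dequantized value differs from the original by at most about one quantization step (the scale), which is the accuracy/efficiency trade-off that intelligent quantization tooling manages automatically.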