Practical Approaches to DNN Quantization

Date: Wednesday, May 24

Start Time: 1:30 pm

End Time: 2:35 pm

Convolutional neural networks (CNNs), widely used in computer vision tasks, require substantial computation and memory resources, making it challenging to run these models on resource-constrained devices. Quantization involves modifying CNNs to use smaller data types (e.g., switching from 32-bit floating-point values to 8-bit integer values). It is an effective way to reduce the computation, memory bandwidth and memory footprint requirements of these models, making it easier to run them on edge devices. However, quantization can degrade the accuracy of CNNs. In this talk, we survey practical techniques for CNN quantization and share best practices, tools and recipes to enable you to get the best results from quantization, including ways to minimize accuracy loss.
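As a rough illustration of the 32-bit float to 8-bit integer switch described in the abstract, the sketch below applies a common affine (scale and zero-point) mapping to a handful of values. It is not taken from the talk: the function names `quantize_int8` and `dequantize`, the per-tensor min/max range choice, and the sample values are all assumptions made for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes with an affine (scale + zero-point) scheme.

    Assumption for this sketch: one scale/zero-point pair for the whole
    list, derived from its min and max (no calibration data, no per-channel
    handling).
    """
    qmin, qmax = -128, 127
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to 1.0 if every value is zero, to avoid dividing by zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = int(round(qmin - lo / scale))
    codes = [
        max(qmin, min(qmax, int(round(v / scale)) + zero_point))
        for v in values
    ]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(q - zero_point) * scale for q in codes]


# Hypothetical weights, chosen only to exercise the code.
weights = [-1.0, -0.25, 0.0, 0.4, 1.5]
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize(codes, scale, zero_point)

# Worst-case rounding error; for in-range values it is about scale / 2.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each code occupies one byte instead of four, which is where the footprint and bandwidth savings come from, while `max_error` gives a feel for the rounding error that underlies the accuracy loss mentioned above.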

Session Speakers

Dwith Chenna
Senior Embedded DSP Engineer, Computer Vision, Magic Leap

Dwith Chenna is a research and development professional with a strong focus on algorithm development and optimization in the fields of computer vision, deep learning and human-computer interaction. He has extensive experience in developing state-of-the-art, performance-critical perception systems, and a deep understanding of the complexities involved in developing and optimizing deep learning models on resource-constrained hardware, such as digital signal processors. Dwith’s responsibilities include evaluating embedded algorithms for performance and accuracy, and driving key performance metrics such as latency, memory, bandwidth and power consumption, often through integration and development of tooling and automation. He is also responsible for quantizing, optimizing and tuning the performance of deep learning models.

See you May 21 - 23, 2024 at the Santa Clara Convention Center!
