State-of-the-Art Model Quantization and Optimization for Efficient Edge AI

Date: Wednesday, May 24

Start Time: 12:00 pm

End Time: 12:30 pm

Extremely efficient edge AI requires more than efficient processors; it also requires tools capable of generating highly efficient software. In this talk, we’ll explain and demonstrate how DEEPX’s DXNN SDK applies state-of-the-art optimization techniques to generate extremely efficient, accurate code for DEEPX’s new M1 neural processor. We’ll begin by describing how the DXNN SDK uses hardware-aware, selective quantization to maintain high accuracy while achieving efficient DNN implementations. Next, we’ll explain how the SDK maps DNN layer operations into processor micro-operations to provide both efficiency and flexibility. We’ll also show how the DXNN SDK conserves memory through tiling, layer fusion and feature reuse. Finally, we’ll illustrate the SDK’s ease of use by implementing a state-of-the-art model on the M1 NPU.
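For readers unfamiliar with the term, the general idea behind selective quantization can be sketched in a few lines of Python. This is only an illustration of the generic technique, not the DXNN SDK's algorithm or API: the function names, the toy layer weights and the error budget are all hypothetical, and the sketch measures error on the weights themselves rather than on model accuracy or any hardware property.

```python
from typing import Dict, List


def quantize_int8(values: List[float]) -> List[float]:
    """Symmetric per-tensor int8 quantization; returns the dequantized values."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / 127.0
    # Round to the nearest integer step, clamp to the int8 range, map back to float.
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


def mean_squared_error(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def select_precisions(layers: Dict[str, List[float]], mse_budget: float) -> Dict[str, str]:
    """Quantize a layer to int8 only if its quantization error fits the budget."""
    plan = {}
    for name, weights in layers.items():
        err = mean_squared_error(weights, quantize_int8(weights))
        plan[name] = "int8" if err <= mse_budget else "float"
    return plan


# Hypothetical toy weights: "head" has one large outlier, which forces a coarse
# quantization step and wipes out its small weights.
layers = {
    "conv1": [0.5, -0.25, 0.125, -0.5],
    "head": [100.0, 0.01, -0.02, 0.03],
}
plan = select_precisions(layers, mse_budget=1e-4)
print(plan)
```

In this toy case the layer with the outlier stays in floating point while the well-behaved layer is quantized. A production flow would typically judge sensitivity by end-to-end accuracy on calibration data rather than by weight error alone.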

Session Speakers

Hyunjin Kim
Senior Staff Engineer, DEEPX

Hyunjin Kim is a Senior Staff Engineer in the deep learning compiler team at DEEPX, where he leads both the front-end and back-end compiler teams. His focus is on deep learning inference performance and energy optimization, as well as software quality improvement. Before joining DEEPX, Hyunjin was a team manager for the Exynos Neural Processing Unit framework at Samsung Electronics, where he led his team in improving software performance using a variety of software optimization techniques. At Samsung, he also worked on improving software quality and testability through test-driven development and code-based documentation. Hyunjin earned his PhD in Computer Science at UCLA.

See you May 21 - 23, 2024 at the Santa Clara Convention Center!
