Date: Wednesday, May 22
Start Time: 4:50 pm
End Time: 5:20 pm
In this presentation, we will explore the accuracy and performance challenges that arise when implementing quantized machine learning inference algorithms on embedded systems. We'll explain how the thoughtful use of fixed-point data types yields significant performance and efficiency gains without compromising accuracy. We'll also discuss why modern SoCs must not only run today's state-of-the-art neural networks efficiently but also adapt to future algorithms. Meeting that requirement means the industry must shift away from adding custom fixed-function accelerator blocks adjacent to legacy architectures and toward embracing flexible, adaptive hardware. This hardware flexibility not only allows SoCs to run new networks; it also enables ongoing software and compiler innovations to explore optimizations such as better data layout, operation fusion, operation remapping, and operation scheduling without being constrained by a fixed hardware pipeline.
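As a minimal illustration of the fixed-point idea discussed above (a sketch, not material from the talk itself), the following Python snippet quantizes the inputs of a dot product to int8 with per-tensor scales, accumulates in integer arithmetic, and dequantizes once at the end. The function names and scale values are hypothetical, chosen only to show how an integer multiply-accumulate can closely approximate the floating-point result:

```python
def quantize(values, scale):
    """Map floats to int8 using a per-tensor scale (round, then clamp to [-128, 127])."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def quantized_dot(x, w, x_scale, w_scale):
    """Integer multiply-accumulate on quantized operands, dequantized once at the end."""
    xq = quantize(x, x_scale)
    wq = quantize(w, w_scale)
    acc = sum(a * b for a, b in zip(xq, wq))  # wide (int32-style) accumulator
    return acc * x_scale * w_scale            # single dequantization step

x = [0.5, -1.25, 2.0]
w = [1.0, 0.75, -0.5]
exact = sum(a * b for a, b in zip(x, w))                  # floating-point reference
approx = quantized_dot(x, w, x_scale=0.02, w_scale=0.01)  # hypothetical scales
print(exact, approx)  # the two results are close but not identical
```

On embedded hardware, the integer multiply-accumulate loop is where the efficiency gain comes from; the quantization error stays small as long as the scales match the dynamic range of the data.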