Date: Thursday, September 17, 2020
Start Time: 9:30 am
End Time: 10:00 am
Due to the great success of deep neural networks (DNNs) in computer vision and other machine learning applications, numerous specialized processors have been developed to execute these models at reduced cost and power consumption. The diverse range of specialized processors becoming available creates great opportunities to deploy DNNs in new applications, but it also creates challenges: a DNN topology designed for one processor may not run efficiently on a different processor. For developers whose DNNs must run on multiple processor targets, the effort required to optimize the network for each processor can be prohibitive. In this talk, we will explain cost-effective techniques that transform DNN layers into other layer types to better fit a specific processor, without the need to retrain from scratch. We will also present quantization and structured sparsification techniques that significantly reduce model size and computation. We'll discuss several case studies in the context of object detection and segmentation.
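To make the two compression techniques named in the abstract concrete, the sketch below shows, in generic form, what post-training int8 quantization and structured (channel-level) sparsification of a convolution weight tensor can look like. This is an illustrative assumption, not the speakers' actual method; the function names quantize_int8 and prune_channels, the keep_ratio parameter, and the tensor shapes are all hypothetical.

```python
# Minimal sketch (assumption, not the talk's implementation) of two generic
# techniques: symmetric int8 quantization and channel pruning by L1 norm.
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization of float weights to int8."""
    scale = np.abs(w).max() / 127.0 if w.size else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale  # approximate reconstruction: q.astype(np.float32) * scale

def prune_channels(w: np.ndarray, keep_ratio: float = 0.5):
    """Structured sparsity: keep only the output channels with the largest L1 norm.

    w is assumed to have shape (out_channels, in_channels, kh, kw).
    """
    norms = np.abs(w).reshape(w.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * w.shape[0])))
    keep = np.sort(np.argsort(norms)[-n_keep:])  # surviving channel indices
    return w[keep], keep

if __name__ == "__main__":
    conv_w = np.random.randn(64, 32, 3, 3).astype(np.float32)
    q, scale = quantize_int8(conv_w)
    pruned, kept = prune_channels(conv_w, keep_ratio=0.5)
    print(q.dtype, scale, pruned.shape)  # int8, a float scale, (32, 32, 3, 3)
```

Because whole output channels are removed, the pruned weight tensor maps directly to a smaller dense convolution, which is what makes structured sparsity attractive on specialized processors that do not accelerate irregular sparsity.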