Date: Tuesday, May 23
Start Time: 1:30 pm
End Time: 2:00 pm
In this talk, we will explore methods to transform large deep neural network (DNN) models into effective compact models. The transformation process that we focus on—from large to compact DNN form—is referred to as pruning. Pruning involves the removal of neurons or parameters from a neural network. When performed strategically, pruning can yield substantial reductions in computational complexity with little or no degradation in accuracy; in some cases, it can even improve accuracy. Pruning thus provides a general approach for enabling real-time inference in resource-constrained embedded computer vision systems. We will provide an overview of important aspects to consider when applying or developing a DNN pruning method, and present details on a recently introduced pruning method called NeuroGRS. NeuroGRS considers network structures and trained weights jointly throughout the pruning process and can produce significantly more compact models than other pruning methods.
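To make the notion of pruning concrete, the sketch below shows unstructured magnitude pruning, one of the simplest baseline methods: weights whose absolute values fall in the smallest fraction of the tensor are zeroed out. This is only an illustrative baseline, not the NeuroGRS method described in the talk; the function name and parameters are hypothetical.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    A simple unstructured-pruning baseline: keep large-magnitude
    weights, set the rest to zero. (Illustrative only.)
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64))
pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {np.count_nonzero(pruned) / pruned.size:.2f}")
```

In practice, pruning is typically followed by fine-tuning to recover accuracy, and structured variants (removing whole neurons or channels, as NeuroGRS does at the structure level) are needed to realize actual speedups on most hardware.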