Where: Mission City B1-B5
Start Time: 10:45
End Time: 11:15
Deep learning on embedded devices is currently enjoying significant success in a number of vision applications, particularly on smartphones, where increasingly prevalent AI cameras can enhance every captured moment. However, the large number of deep learning network architectures proposed every year poses real challenges for software developers, who need to implement these demanding algorithms very efficiently.
In this presentation, we describe a structured approach to performance analysis of deep learning software implementations. We examine the fundamentals of performance analysis for deep learning, presenting metrics and methodologies. We then show how our top-down approach can be used to detect and fix performance bottlenecks, creating efficient deep neural network software implementations. Finally, we illustrate typical software optimizations that make the best use of available computational resources.