Date: Thursday, September 17, 2020
Start Time: 10:00 am
End Time: 10:30 am
In the race for power efficiency in neural network processing, optimizing memory use to reduce data traffic is critical. Many processors include a small local memory (typically SRAM) used as a scratchpad to reduce the expensive data traffic to and from a large remote memory (e.g., DRAM). The specific structure of neural networks enables advanced techniques for optimizing the use of this local memory. In this presentation we describe the key aspects of memory management optimization for neural networks, along with the trade-offs that must be managed in light of the processor architecture and the details of the network. In addition, we show the importance of tailoring the memory management approach to the specific network, illustrated by the analysis of a case study.
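To make the scratchpad idea concrete, here is a minimal sketch (not from the presentation) of tiled processing, in which a large DRAM-resident tensor is streamed through a small local buffer so each element crosses the DRAM interface only once in each direction. The names SRAM_BYTES, TILE_ELEMS, and run_layer_tiled are hypothetical, the static arrays and memcpy calls stand in for a real SRAM region and DMA transfers, and a simple element-wise operation stands in for a layer's computation; real trade-offs depend on the processor architecture and the network.

#include <stddef.h>
#include <string.h>

/* Hypothetical sizes: a local SRAM scratchpad far smaller than the
 * activation tensor held in DRAM. */
#define SRAM_BYTES   (64 * 1024)
#define TILE_ELEMS   (SRAM_BYTES / (2 * sizeof(float)))  /* input + output tile */

/* Process one tile entirely out of the scratchpad (a simple
 * activation function stands in for a layer's inner computation). */
static void process_tile(const float *in, float *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] > 0.0f ? in[i] : 0.0f;   /* ReLU as a placeholder */
}

/* Stream a large DRAM-resident tensor through the scratchpad tile by
 * tile, so every element is read from and written to DRAM exactly once,
 * regardless of how often it is reused inside the tile. */
void run_layer_tiled(const float *dram_in, float *dram_out, size_t total)
{
    static float sram_in[TILE_ELEMS];    /* stand-in for the local SRAM */
    static float sram_out[TILE_ELEMS];

    for (size_t off = 0; off < total; off += TILE_ELEMS) {
        size_t n = (total - off < TILE_ELEMS) ? (total - off) : TILE_ELEMS;

        memcpy(sram_in, dram_in + off, n * sizeof(float));   /* "DMA" in  */
        process_tile(sram_in, sram_out, n);
        memcpy(dram_out + off, sram_out, n * sizeof(float)); /* "DMA" out */
    }
}

Choosing the tile size is one of the trade-offs the presentation refers to: larger tiles amortize transfer overhead and expose more reuse, but must still fit the layer's inputs, outputs, and weights within the available SRAM.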